Usage
The package uses the Stocktwits API, which manages three types of streams: user, symbol and conversation. Currently the package supports the user and symbol streams.
There are some parameters that you can use. These are mandatory (at least one of the two must be set):
symbols, a list of symbols whose messages you want to download; it must contain at least one element, or the users parameter must be set
users, a list of users whose messages you want to download; it must contain at least one element, or the symbols parameter must be set
And these are optional:
only_combo, a boolean; set it to True when you want to download only the messages matching both a specific symbol and a specific user (both previous parameters are required)
min, the ID of the twit from which you want to start downloading
max, the ID of the twit at which you want to stop downloading
limit, the number of messages to download in one shot
start, the datetime from which you want to start downloading
chunk, the period (day, week or month) used to split the data
filename_prefix, the prefix of the files where the data is saved
filename_suffix, the suffix of the files where the data is saved
is_verbose, a boolean; set it to True to print information about what the system is saving
Without optional parameters, the system downloads the last 30 messages and prints them in the output. If you want to save them to one or more files, you have to use at least the chunk parameter.
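To illustrate what the chunk parameter means, here is a minimal, self-contained sketch of splitting messages by period. This is not the package's internal code; the bucket keys (e.g. 20220404) only mimic the date part that appears in the output file names used in the examples.

```python
from collections import defaultdict
from datetime import datetime

def split_by_chunk(messages, chunk="day"):
    """Group messages by the period of their created_at timestamp.

    chunk may be 'day', 'week' or 'month'. Illustrative sketch only:
    the real package handles chunking internally.
    """
    buckets = defaultdict(list)
    for msg in messages:
        dt = datetime.strptime(msg["created_at"], "%Y-%m-%dT%H:%M:%SZ")
        if chunk == "day":
            key = dt.strftime("%Y%m%d")
        elif chunk == "week":
            # ISO year/week, e.g. '2022W14'
            iso = dt.isocalendar()
            key = f"{iso[0]}W{iso[1]:02d}"
        else:  # month
            key = dt.strftime("%Y%m")
        buckets[key].append(msg)
    return dict(buckets)

messages = [
    {"id": 1, "created_at": "2022-04-04T09:30:00Z"},
    {"id": 2, "created_at": "2022-04-04T15:00:00Z"},
    {"id": 3, "created_at": "2022-04-05T10:00:00Z"},
]
chunks = split_by_chunk(messages, "day")
# two buckets: '20220404' with two messages, '20220405' with one
```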
Examples
Remember to install the package via pip
pip3 install stocktwits-collector
or via a requirements.txt that contains a line with stocktwits-collector
pip3 install --upgrade -r requirements.txt
import os
import json
import pandas as pd
from stocktwits_collector.collector import Collector
sc = Collector()
# download the last messages (here limited to 4; the default maximum is 30)
messages = sc.get_history({'symbols': ['TSLA'], 'limit': 4})
# download the messages from a date to today
messages = sc.get_history({'symbols': ['TSLA'], 'start': '2022-04-04T00:00:00Z'})
# save the messages to files split per chunk, from a date to the max ID
chunk = sc.save_history({'symbols': ['TSLA'], 'start': '2022-04-04T00:00:00Z', 'chunk': 'day'})
# load data from one file
with open('history.20220404.json', 'r') as f:
    data = json.loads(f.read())
df = pd.json_normalize(
    data,
    meta=[
        'id', 'body', 'created_at',
        ['user', 'id'],
        ['user', 'username'],
        ['entities', 'sentiment', 'basic']
    ]
)
twits = df[['id', 'body', 'created_at', 'user.username', 'entities.sentiment.basic']]
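To sanity-check the loaded data you can, for example, count the sentiment labels. The snippet below uses inline sample records shaped like the Stocktwits messages above (illustrative data, not real API output), so it runs without any downloaded file:

```python
import pandas as pd

# Illustrative records mimicking the Stocktwits message layout
data = [
    {"id": 1, "body": "to the moon", "created_at": "2022-04-04T09:30:00Z",
     "user": {"id": 10, "username": "alice"},
     "entities": {"sentiment": {"basic": "Bullish"}}},
    {"id": 2, "body": "selling here", "created_at": "2022-04-04T10:00:00Z",
     "user": {"id": 11, "username": "bob"},
     "entities": {"sentiment": {"basic": "Bearish"}}},
    {"id": 3, "body": "no idea", "created_at": "2022-04-04T11:00:00Z",
     "user": {"id": 12, "username": "carol"},
     "entities": {"sentiment": {"basic": None}}},
]
df = pd.json_normalize(data)
# messages without a sentiment become NaN and are dropped by value_counts
counts = df["entities.sentiment.basic"].value_counts()
```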
# load data from multiple files
frames = []
path = '.'
for file in os.listdir(path):
    # skip anything that is not a saved history file
    if not file.endswith('.json'):
        continue
    filename = f"{path}/{file}"
    with open(filename, 'r') as f:
        data = json.loads(f.read())
    frames.append(pd.json_normalize(
        data,
        meta=[
            'id', 'body', 'created_at',
            ['user', 'id'],
            ['user', 'username'],
            ['entities', 'sentiment', 'basic']
        ]
    ))
df = pd.concat(frames).sort_values(by=['id'])
twits = df[['id', 'body', 'created_at', 'user.username', 'entities.sentiment.basic']]
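Depending on how the chunk files were produced, the same twit may appear in more than one file; deduplicating on id after the concat is cheap insurance. A small self-contained sketch with inline frames standing in for two loaded files:

```python
import pandas as pd

# Two frames with an overlapping twit (id 2), as can happen when
# chunk files overlap at their boundaries
a = pd.DataFrame([{"id": 1, "body": "first"}, {"id": 2, "body": "second"}])
b = pd.DataFrame([{"id": 2, "body": "second"}, {"id": 3, "body": "third"}])
df = (
    pd.concat([a, b])
    .drop_duplicates(subset="id")  # keep one row per twit ID
    .sort_values(by="id")
    .reset_index(drop=True)
)
# df now has three rows, ids 1, 2, 3
```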