Profile Module
username_cleaner
def username_cleaner(username: str) -> str
Strips @ symbol from a username.
Example:
@dgnsrekt -> dgnsrekt
Arguments:
username
- Username with the @ symbol to remove.
Returns:
Username with @ symbol stripped.
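The behavior described above can be sketched as follows. This is a minimal illustration, not the module's source: the name username_cleaner_sketch is hypothetical, and removing every @ (rather than only a leading one) is an assumption.

```python
def username_cleaner_sketch(username: str) -> str:
    """Hypothetical stand-in: remove every '@' from the username."""
    # Assumption: all '@' characters are dropped, not just a leading one.
    return username.replace("@", "")


print(username_cleaner_sketch("@dgnsrekt"))
```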
link_parser
def link_parser(element: HTML) -> str
Gets the first link from an HTML element.
Used for the profile's website, photo, and banner links.
Arguments:
element
- HTML element with a link to parse.
Returns:
First link from a collection of links.
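One way to picture "first link from a collection of links" is the standard-library sketch below. The module's HTML type is not shown in this section, so the sketch takes a raw markup string instead. The names _LinkCollector and first_link_sketch are hypothetical, and returning None when no link exists is an assumption.

```python
from html.parser import HTMLParser
from typing import List, Optional


class _LinkCollector(HTMLParser):
    """Collects href values from <a> tags in document order."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def first_link_sketch(markup: str) -> Optional[str]:
    """Hypothetical helper: return the first href found, or None."""
    collector = _LinkCollector()
    collector.feed(markup)
    return collector.links[0] if collector.links else None
```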
parse_user_id_from_banner
def parse_user_id_from_banner(banner_url: str) -> str
Parses the user's ID from the user's banner photo URL.
The user ID can only be parsed from the banner photo's URL.
Example:
/pic/profile_banners%2F2474416796%2F1600567028%2F1500x500 -> 2474416796
The user ID is the segment between the first and second %2F in the banner link.
Arguments:
banner_url
- URL of the profile's banner photo.
Returns:
The target profile's user ID.
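The extraction shown in the example can be sketched with the standard library. This is an illustration under stated assumptions, not the module's implementation: parse_user_id_sketch is a hypothetical name, and the sketch assumes the ID always follows the profile_banners segment once %2F is decoded to a slash.

```python
from urllib.parse import unquote


def parse_user_id_sketch(banner_url: str) -> str:
    """Hypothetical helper: pull the user ID out of a banner URL."""
    # Decode percent-escapes so "%2F" becomes "/".
    decoded = unquote(banner_url)
    # Keep everything after "profile_banners/"; tail is "" if the marker is absent.
    _, _, tail = decoded.partition("profile_banners/")
    # The user ID is the first path segment of what remains.
    return tail.split("/")[0]


banner = "/pic/profile_banners%2F2474416796%2F1600567028%2F1500x500"
print(parse_user_id_sketch(banner))
```

With this approach, a URL that lacks the profile_banners segment yields an empty string instead of raising an error.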
stat_cleaner
def stat_cleaner(stat: str) -> int
Cleans and converts a single stat.
Used for the tweets, followers, following, and likes count sections.
Arguments:
stat
- Stat to be cleaned.
Returns:
A stat with commas removed and converted to int.
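A minimal sketch of the comma removal and conversion described above. The name stat_cleaner_sketch is hypothetical, and stripping surrounding whitespace is an added assumption.

```python
def stat_cleaner_sketch(stat: str) -> int:
    """Hypothetical stand-in: '1,234' -> 1234."""
    # Remove thousands separators, then convert to int.
    # Assumption: surrounding whitespace is also stripped.
    return int(stat.replace(",", "").strip())


print(stat_cleaner_sketch("1,234,567"))
```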
profile_parser
def profile_parser(elements: Dict) -> Dict
Converts parsed sections to text.
Cleans and processes a dictionary of gathered html elements.
Arguments:
elements
- Elements to be cleaned and converted.
Returns:
A dictionary of element sections cleaned and converted to their finalized types.
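The idea of turning a dictionary of raw sections into finalized types might look like the sketch below. Everything here is an assumption made for illustration: the key names (username, followers_count, biography), the "_count" suffix convention, and the use of plain strings in place of HTML elements are all hypothetical.

```python
from typing import Dict, Union


def profile_parser_sketch(elements: Dict[str, str]) -> Dict[str, Union[str, int]]:
    """Hypothetical stand-in: clean each raw section and convert its type."""
    cleaned: Dict[str, Union[str, int]] = {}
    for key, raw in elements.items():
        text = raw.strip()
        if key.endswith("_count"):
            # Assumed convention: count sections become ints with commas removed.
            cleaned[key] = int(text.replace(",", ""))
        elif key == "username":
            # Usernames lose their @ symbol.
            cleaned[key] = text.replace("@", "")
        else:
            # Everything else stays as trimmed text.
            cleaned[key] = text
    return cleaned
```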
html_parser
def html_parser(html: HTML) -> Dict
Parses an HTML element into individual sections.
Given an HTML element, html_parser searches for each profile section using CSS selectors. All parsed HTML elements are gathered into a dictionary and returned.
Arguments:
html
- HTML element from a successfully scraped nitter profile response.
Returns:
A dictionary of found elements from the parsed sections.
get_profile
def get_profile(username: str, not_found_ok: bool = False, address: str = "https://nitter.net") -> Optional[Profile]
Scrapes nitter for the target user's profile information.
Arguments:
username
- The target profile's username.
not_found_ok
- If not_found_ok is false (the default), a ValueError is raised if the target profile doesn't exist. If not_found_ok is true, None is returned instead.
address
- The address to scrape profile data from. The default scrape location is 'https://nitter.net', which should be used as a backup. This value will normally be replaced by the address of a local docker container instance of nitter.
Returns:
Profile object if successfully scraped, otherwise None.
Raises:
ValueError
- If the target profile does not exist and the not_found_ok argument is false.
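The not_found_ok contract described above can be illustrated without any network access. In this sketch a dictionary stands in for the scraped nitter response. The names _FAKE_PROFILES and get_profile_sketch are hypothetical, and the returned value is a plain dict, not the module's Profile object.

```python
from typing import Dict, Optional

# Stand-in for scraped data; the real function fetches from a nitter instance.
_FAKE_PROFILES: Dict[str, Dict[str, str]] = {"dgnsrekt": {"username": "dgnsrekt"}}


def get_profile_sketch(username: str, not_found_ok: bool = False) -> Optional[Dict[str, str]]:
    """Hypothetical stand-in showing only the not-found handling."""
    profile = _FAKE_PROFILES.get(username)
    if profile is None:
        if not_found_ok:
            # Caller opted in to a None result for missing profiles.
            return None
        raise ValueError(f"Profile {username!r} not found.")
    return profile
```

A caller that prefers a None check over exception handling would pass not_found_ok=True and test the result before using it.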