Py Template

Thursday, May 8, 2025

Web Scraping and Data Extraction - Social Media Hashtag Monitor

Notes:

Problem Solved: Scrapes Twitter (now X) for recent tweets that mention a given hashtag.
Customization Benefits: Track campaigns, analyze sentiment, or discover influencers.
Further Adoption: Store in a database, analyze sentiment, or trigger alerts.

Python Code:

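import snscrape.modules.twitter as sntwitter

# Note: snscrape depends on unauthenticated access, which Twitter/X restricted
# in 2023, so this scraper may return nothing or raise an error.
def get_tweets_by_hashtag(hashtag, max_tweets=50):
    """Collect up to max_tweets tweets for a hashtag (given without '#')."""
    tweets = []
    scraper = sntwitter.TwitterHashtagScraper(hashtag)
    for i, tweet in enumerate(scraper.get_items()):
        if i >= max_tweets:
            break
        # Newer snscrape releases expose the text as rawContent; older ones use content
        text = getattr(tweet, 'rawContent', None) or tweet.content
        tweets.append({'user': tweet.user.username, 'content': text})
    return tweets

# Example usage
tweets = get_tweets_by_hashtag("AI")
for t in tweets[:5]:
    print(f"{t['user']}: {t['content'][:80]}")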
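The notes for this script mention storing results in a database. A minimal sketch of that step, assuming the list-of-dicts shape returned by get_tweets_by_hashtag. The helper name save_tweets, the table layout, and the choice of SQLite are illustrative assumptions, not part of the original script.

```python
import sqlite3

def save_tweets(conn, hashtag, tweets):
    """Insert tweet dicts ({'user': ..., 'content': ...}) into a SQLite table.

    Returns the number of rows written. Table and column names are hypothetical.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tweets ("
        "hashtag TEXT NOT NULL, username TEXT NOT NULL, content TEXT NOT NULL)"
    )
    rows = [(hashtag, t['user'], t['content']) for t in tweets]
    conn.executemany(
        "INSERT INTO tweets (hashtag, username, content) VALUES (?, ?, ?)", rows
    )
    conn.commit()
    return len(rows)

# Example with hand-written sample data (no network access needed)
conn = sqlite3.connect(":memory:")
sample = [
    {'user': 'alice', 'content': 'Trying a new #AI model'},
    {'user': 'bob', 'content': '#AI in production'},
]
saved = save_tweets(conn, "AI", sample)
stored = conn.execute("SELECT COUNT(*) FROM tweets").fetchone()[0]
print(saved, stored)
```

An in-memory database keeps the example self-contained; passing a file path to sqlite3.connect would persist the rows between runs.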

Web Scraping and Data Extraction - PDF Invoice Parser

Notes:

Problem Solved: Extracts structured data (like totals, dates) from PDF invoices.
Customization Benefits: Works with invoice templates or billing automation systems.
Further Adoption: Connect to accounting software or ERP platforms.

Python Code:

import pdfplumber

def extract_invoice_data(pdf_path):
    """Read labelled fields from the first page of a text-based PDF invoice."""
    with pdfplumber.open(pdf_path) as pdf:
        # extract_text() may return None or an empty string for scanned pages
        text = pdf.pages[0].extract_text() or ""
    data = {}
    for line in text.split('\n'):
        # Keep everything after the first colon so values such as "10:30" survive
        value = line.partition(":")[2].strip()
        if "Invoice Number" in line:
            data['invoice_number'] = value
        elif "Total Amount" in line:
            data['total_amount'] = value
        elif "Date" in line:
            data['date'] = value
    return data

# Example usage
# print(extract_invoice_data("invoice_sample.pdf"))
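Substring matching works for one fixed template, but regular expressions make the expected value formats explicit. A minimal sketch that operates on already-extracted text, so it can be tried without a PDF. The patterns, the ISO date format, and the helper name parse_invoice_text are assumptions for illustration and would need adjusting to the real invoice layout.

```python
import re
from decimal import Decimal

# Hypothetical field patterns; adjust the labels and formats to the template in use
FIELD_PATTERNS = {
    'invoice_number': re.compile(r"Invoice Number\s*:\s*(\S+)"),
    'total_amount': re.compile(r"Total Amount\s*:[^\d\n]*([\d,]+(?:\.\d+)?)"),
    'date': re.compile(r"\bDate\s*:\s*(\d{4}-\d{2}-\d{2})"),
}

def parse_invoice_text(text):
    """Return the fields found in the text; fields that do not match are omitted."""
    data = {}
    for field, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        if match:
            data[field] = match.group(1)
    if 'total_amount' in data:
        # Drop thousands separators and keep exact decimal arithmetic
        data['total_amount'] = Decimal(data['total_amount'].replace(',', ''))
    return data

sample_text = "Invoice Number: INV-1001\nDate: 2025-05-08\nTotal Amount: $1,250.00"
parsed = parse_invoice_text(sample_text)
print(parsed)
```

Converting the total to Decimal rather than float avoids rounding surprises if the value is later summed or compared.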

Web Scraping and Data Extraction - Real-Time News Extractor

Notes:

Problem Solved: Pulls the latest headlines and links from a news site's RSS feed.
Customization Benefits: Filter by topic or sentiment, or push to dashboards.
Further Adoption: Use for trend analysis, sentiment detection, or alert systems.

Python Code:

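import feedparser

def get_news_rss(feed_url):
    """Return the title and link of every entry in an RSS/Atom feed."""
    feed = feedparser.parse(feed_url)
    # Entries behave like dicts; .get() avoids errors when a field is missing
    return [{'title': entry.get('title', ''), 'link': entry.get('link', '')}
            for entry in feed.entries]

# Example usage
rss_url = "http://feeds.bbci.co.uk/news/rss.xml"
headlines = get_news_rss(rss_url)
for news in headlines[:5]:
    print(news['title'], "-", news['link'])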
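Filtering by topic, as suggested in the notes, can be a simple keyword match over the titles. A minimal sketch, assuming the list-of-dicts shape returned by get_news_rss. The helper name filter_headlines and the sample headlines are made up for illustration.

```python
def filter_headlines(headlines, keywords):
    """Keep headlines whose title contains any keyword (case-insensitive substring)."""
    lowered = [k.lower() for k in keywords]
    return [h for h in headlines
            if any(k in h['title'].lower() for k in lowered)]

# Hand-written sample data standing in for a parsed feed
sample_headlines = [
    {'title': 'Central bank holds interest rates', 'link': 'https://example.com/1'},
    {'title': 'Local team wins cup final', 'link': 'https://example.com/2'},
    {'title': 'Markets rally as rates stay flat', 'link': 'https://example.com/3'},
]
matches = filter_headlines(sample_headlines, ['rates'])
for m in matches:
    print(m['title'])
```

Substring matching is deliberately crude: it will also match keywords embedded in longer words, so a word-boundary regex may be preferable for short keywords.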

Web Scraping and Data Extraction - Job Listing Aggregator

Notes:

Problem Solved: Extracts job postings (title and company) from a job board. Indeed is used here; other boards would need their own URLs and selectors.
Customization Benefits: Filter by keywords, location, or salary.
Further Adoption: Feed into job boards, CRMs, or recruitment analytics platforms.

Python Code:

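import requests
from bs4 import BeautifulSoup

def scrape_indeed_jobs(query, location):
    """Scrape job titles and companies from an Indeed search results page."""
    base_url = "https://www.indeed.com/jobs"
    params = {"q": query, "l": location}
    # The site may reject requests that lack a browser-like User-Agent
    headers = {"User-Agent": "Mozilla/5.0"}
    response = requests.get(base_url, params=params, headers=headers, timeout=10)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, 'html.parser')
    jobs = []
    # These selectors reflect one version of Indeed's markup and may need updating
    for job_card in soup.select('.result'):
        title = job_card.select_one('h2.jobTitle')
        company = job_card.select_one('.companyName')
        if title is None or company is None:
            continue  # skip cards that do not match the expected layout
        jobs.append({'title': title.text.strip(), 'company': company.text.strip()})
    return jobs

# Example usage
print(scrape_indeed_jobs("data analyst", "New York, NY"))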
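Aggregating across several boards means combining their results and removing repeats before filtering. A minimal sketch of that step, assuming each scraper returns dicts with 'title' and 'company' keys like scrape_indeed_jobs. The helper names and sample listings are hypothetical.

```python
def merge_job_listings(*sources):
    """Combine job lists, dropping duplicates by (title, company), ignoring case."""
    seen = set()
    merged = []
    for jobs in sources:
        for job in jobs:
            key = (job['title'].strip().lower(), job['company'].strip().lower())
            if key not in seen:
                seen.add(key)
                merged.append(job)
    return merged

def filter_jobs(jobs, keyword):
    """Keep jobs whose title contains the keyword (case-insensitive)."""
    keyword = keyword.lower()
    return [job for job in jobs if keyword in job['title'].lower()]

# Hand-written sample data standing in for the output of two scrapers
board_a = [{'title': 'Data Analyst', 'company': 'Acme'},
           {'title': 'Data Engineer', 'company': 'Acme'}]
board_b = [{'title': 'data analyst', 'company': 'ACME'},
           {'title': 'Senior Data Analyst', 'company': 'Globex'}]

all_jobs = merge_job_listings(board_a, board_b)
analyst_jobs = filter_jobs(all_jobs, 'analyst')
print(len(all_jobs), len(analyst_jobs))
```

The first occurrence of each posting is kept, so the order of the sources decides which board's copy survives.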

Web Scraping and Data Extraction - E-commerce Price Tracker

Notes:

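Problem Solved: Fetches a product's current title and price from an e-commerce page (Amazon here; sites such as Flipkart need their own selectors). Running it on a schedule tracks price changes over time.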
Customization Benefits: Monitor competitors, automate pricing strategies, or trigger alerts.
Further Adoption: Integrate with BI tools, pricing engines, or push notifications.

Python Code:

import requests
from bs4 import BeautifulSoup

def get_amazon_price(product_url, headers):
    """Fetch a product page and return its title and displayed price, if found."""
    response = requests.get(product_url, headers=headers, timeout=10)
    response.raise_for_status()
    soup = BeautifulSoup(response.content, 'html.parser')
    # find() returns None when the element is absent (e.g. on a CAPTCHA page)
    title_tag = soup.find(id="productTitle")
    price_tag = soup.find('span', {'class': 'a-offscreen'})
    return {
        'title': title_tag.get_text(strip=True) if title_tag else None,
        'price': price_tag.get_text(strip=True) if price_tag else None,
    }

# Example usage
headers = {'User-Agent': 'Mozilla/5.0'}
url = 'https://www.amazon.com/dp/B08N5WRWNW'  # Example product
print(get_amazon_price(url, headers))
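The scraper returns the price as display text, so triggering an alert first requires a numeric value. A minimal sketch, assuming US-style formatting (comma thousands separator, dot decimal point). The helper names parse_price and price_at_or_below are illustrative, not part of the original script.

```python
import re
from decimal import Decimal

def parse_price(price_text):
    """Convert display text such as '$1,299.99' to Decimal; None if no number is found."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", price_text or "")
    if match is None:
        return None
    return Decimal(match.group(0).replace(",", ""))

def price_at_or_below(price_text, target):
    """Return True when the parsed price is at or below the target price."""
    price = parse_price(price_text)
    return price is not None and price <= Decimal(str(target))

current = parse_price("$1,299.99")
print(current, price_at_or_below("$1,299.99", 1300))
```

A result of None (for example when the scraper found no price element) never triggers an alert, which keeps a blocked request from being mistaken for a price drop.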

Customer Relationship Management (CRM) - Voice of Customer Analyzer

Notes:

Problem Solved: Performs sentiment analysis on customer reviews or NPS responses.
Customization Benefits: Tailor sentiment thresholds or keywords per product.
Further Adoption: Feed results into product improvement or alerting systems.

Python Code:

from textblob import TextBlob
import pandas as pd

def analyze_sentiment(text):
    """Return TextBlob polarity in [-1.0, 1.0]; non-string values are cast to str."""
    return TextBlob(str(text)).sentiment.polarity

feedback_df = pd.read_csv("customer_feedback.csv")  # Column: 'feedback'
# Treat missing feedback as empty text rather than NaN
feedback_df['feedback'] = feedback_df['feedback'].fillna('')
feedback_df['sentiment_score'] = feedback_df['feedback'].apply(analyze_sentiment)

# Categorize feedback
feedback_df['sentiment_label'] = feedback_df['sentiment_score'].apply(
    lambda x: 'Positive' if x > 0.1 else ('Negative' if x < -0.1 else 'Neutral')
)
print(feedback_df[['feedback', 'sentiment_label']])
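Tailoring thresholds per product, as the notes suggest, amounts to looking up the cut-offs before labelling a score. A minimal sketch that works on plain polarity scores, so it runs without TextBlob or a CSV file. The product name and its threshold values are invented for illustration; the defaults mirror the 0.1 and -0.1 cut-offs used in the script above.

```python
# (negative_below, positive_above) cut-offs; the per-product entry is hypothetical
DEFAULT_THRESHOLDS = (-0.1, 0.1)
PRODUCT_THRESHOLDS = {'premium_plan': (-0.05, 0.3)}

def label_sentiment(score, product=None):
    """Map a polarity score to a label using the product's thresholds, if any."""
    negative_below, positive_above = PRODUCT_THRESHOLDS.get(product, DEFAULT_THRESHOLDS)
    if score > positive_above:
        return 'Positive'
    if score < negative_below:
        return 'Negative'
    return 'Neutral'

print(label_sentiment(0.2), label_sentiment(0.2, 'premium_plan'))
```

With a DataFrame that has a product column, the same function could be applied row by row in place of the lambda.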

Customer Relationship Management (CRM) - Sales Forecasting Tool

Notes:

Problem Solved: Predicts next month's closed deals from the number of open opportunities, using a linear model fitted on historical pipeline data.
Customization Benefits: Incorporate external data like seasonality or macroeconomic factors.
Further Adoption: Display results in BI dashboards or CRM widgets.

Python Code:

import pandas as pd
from sklearn.linear_model import LinearRegression

df = pd.read_csv("sales_pipeline.csv")  # Columns: 'month', 'opportunities', 'closed_deals'
X = df[['opportunities']]  # feature matrix (must be 2-D)
y = df['closed_deals']     # target

# Fit a simple linear model: closed_deals ~ opportunities
model = LinearRegression()
model.fit(X, y)

# Forecast next month's sales
next_opps = pd.DataFrame({'opportunities': [150]})  # assumed pipeline size
forecast = model.predict(next_opps)
print(f"Predicted sales for next month: {forecast[0]:.2f}")
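The script fits on every available row and never measures how far its forecasts are from actual results. A minimal sketch of a holdout check, written in plain Python rather than scikit-learn so it runs without dependencies. The helpers fit_line and holdout_mae and the sample figures are illustrative assumptions; the most recent months are held out because the data is ordered in time.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def holdout_mae(xs, ys, n_test):
    """Fit on all but the last n_test points; return mean absolute error on the rest."""
    slope, intercept = fit_line(xs[:-n_test], ys[:-n_test])
    errors = [abs(y - (slope * x + intercept))
              for x, y in zip(xs[-n_test:], ys[-n_test:])]
    return sum(errors) / len(errors)

# Made-up monthly figures in which 20% of opportunities close
opportunities = [100, 120, 140, 160, 180]
closed_deals = [20, 24, 28, 32, 36]

slope, intercept = fit_line(opportunities, closed_deals)
mae = holdout_mae(opportunities, closed_deals, n_test=2)
print(f"slope={slope:.2f}, holdout MAE={mae:.2f}")
```

On real pipeline data the error will not be zero; comparing it with the typical monthly deal count gives a rough sense of whether the forecast is usable.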

IoT (Internet of Things) Automation - Smart Energy Usage Tracker
