This project explores historical stock market data to understand how prices move over time, where volatility clusters, and how different sectors compare in terms of risk and return. By applying statistical techniques and visual analysis, you can uncover patterns that are invisible in raw price tables, from moving average crossovers that signal momentum shifts to periods of elevated volatility that coincide with macroeconomic events.
Dataset overview
The dataset consists of historical OHLCV (open, high, low, close, volume) records for a selection of publicly traded tickers across multiple sectors, sourced via the yfinance library. Each row represents one trading day and includes the following fields:
| Column | Description |
|---|---|
| Date | Trading date (index) |
| Open | Opening price |
| High | Intraday high price |
| Low | Intraday low price |
| Close | Closing price |
| Volume | Number of shares traded |
Methodology
Data acquisition
You pull historical price data using yfinance, specifying a list of ticker symbols and a date range. Data is downloaded into a multi-level Pandas DataFrame, then stacked into a tidy long format with one row per ticker per date. This makes filtering and grouping by ticker straightforward in later steps.
Cleaning and preprocessing
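The cleaning steps in this section assume the long format produced during data acquisition. A minimal sketch of that reshaping follows, using a small synthetic frame in place of a real download. The ticker symbols, the column level names (`Field`, `Ticker`), and the helper `to_long` are illustrative assumptions, and the column layout returned by yfinance may differ by version and options.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a multi-ticker download: the columns form a
# two-level index of (field, ticker). This layout is an assumption.
dates = pd.bdate_range("2024-01-02", periods=4, name="Date")
fields = ["Open", "High", "Low", "Close", "Volume"]
tickers = ["AAA", "BBB"]  # hypothetical symbols
columns = pd.MultiIndex.from_product([fields, tickers], names=["Field", "Ticker"])
rng = np.random.default_rng(0)
wide = pd.DataFrame(
    rng.uniform(10, 20, size=(len(dates), len(columns))),
    index=dates,
    columns=columns,
)


def to_long(wide_prices: pd.DataFrame) -> pd.DataFrame:
    """Move the ticker level from the columns into the rows."""
    long_prices = wide_prices.stack(level="Ticker")
    long_prices.columns.name = None
    # One row per ticker per date, ordered for later per-ticker operations.
    return long_prices.reset_index().sort_values(
        ["Ticker", "Date"], ignore_index=True
    )


prices = to_long(wide)
```

With the ticker in its own column, per-ticker work reduces to `prices.groupby("Ticker")`.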
You ensure the Date column is parsed as a datetime index, forward-fill any isolated missing closing prices, and drop the remaining rows with missing values (typically caused by trading halts or data gaps). You also compute a Daily_Return column as the percentage change in closing price from the previous day. This column is the foundation for all downstream risk metrics.
Trend analysis
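The moving averages in this section are computed on cleaned closing prices. The preprocessing described earlier can be sketched roughly as follows. The helper name `clean_prices`, the toy data, the one-row fill limit (one reading of "isolated"), and the choice to fill before dropping are assumptions rather than details confirmed by the project.

```python
import numpy as np
import pandas as pd


def clean_prices(prices: pd.DataFrame) -> pd.DataFrame:
    """Parse dates, patch single missing closes, drop other gaps, add returns."""
    out = prices.copy()
    out["Date"] = pd.to_datetime(out["Date"])
    out = out.sort_values(["Ticker", "Date"])
    # Fill at most one consecutive missing close per ticker (assumed limit).
    out["Close"] = out.groupby("Ticker")["Close"].ffill(limit=1)
    # Drop any rows that still contain missing values.
    out = out.dropna()
    # Percentage change versus the previous row of the same ticker.
    # The first row of each ticker has no previous close, so it stays NaN.
    out["Daily_Return"] = out.groupby("Ticker")["Close"].pct_change()
    return out.set_index("Date")


raw = pd.DataFrame({
    "Date": ["2024-01-02", "2024-01-03", "2024-01-04"] * 2,
    "Ticker": ["AAA"] * 3 + ["BBB"] * 3,  # hypothetical symbols
    "Close": [100.0, np.nan, 110.0, 50.0, 55.0, 44.0],
})
cleaned = clean_prices(raw)
```

Grouping by ticker before `pct_change` keeps the first return of one ticker from being computed against the last close of another.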
You compute short-term (20-day) and long-term (50-day) simple moving averages (SMA) for each ticker. These rolling averages smooth out daily noise and reveal the underlying trend direction. A golden cross — where the 20-day SMA crosses above the 50-day SMA — is a commonly watched bullish signal, while a death cross indicates the opposite.
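A minimal sketch of the crossover logic for a single ticker is shown here. The function name `add_sma_signals` and the output column names are hypothetical, and the toy example uses 2-day and 3-day windows only so that the crossings fit in ten rows.

```python
import pandas as pd


def add_sma_signals(close: pd.Series, short: int = 20, long: int = 50) -> pd.DataFrame:
    """Compute two simple moving averages and flag the days where they cross."""
    out = pd.DataFrame({"Close": close})
    out["SMA_short"] = close.rolling(short).mean()
    out["SMA_long"] = close.rolling(long).mean()
    # Positive when the short average sits above the long average.
    diff = out["SMA_short"] - out["SMA_long"]
    prev_diff = diff.shift(1)
    # Comparisons against NaN are False, so the warm-up rows never signal.
    out["golden_cross"] = (diff > 0) & (prev_diff <= 0)
    out["death_cross"] = (diff < 0) & (prev_diff >= 0)
    return out


toy_close = pd.Series([10.0, 9.0, 8.0, 7.0, 8.0, 10.0, 12.0, 11.0, 9.0, 7.0])
signals = add_sma_signals(toy_close, short=2, long=3)
golden_days = signals.index[signals["golden_cross"]].tolist()
death_days = signals.index[signals["death_cross"]].tolist()
```

For a long-format frame, the same function could be applied once per ticker, for example inside a loop over `groupby("Ticker")`.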
Volatility and risk metrics
You measure volatility using a 30-day rolling standard deviation of daily returns, then annualize it by multiplying by the square root of 252 (trading days per year). You also compute the Sharpe ratio per ticker to compare risk-adjusted returns, and calculate maximum drawdown to identify the worst peak-to-trough decline in the observation window.
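The three metrics can be sketched as below. The text does not state a risk-free rate or an exact Sharpe formula, so this version assumes a zero risk-free rate by default and annualizes the daily mean-to-standard-deviation ratio by the square root of 252. The function names are illustrative.

```python
import numpy as np
import pandas as pd

TRADING_DAYS = 252


def rolling_annualized_vol(returns: pd.Series, window: int = 30) -> pd.Series:
    """Rolling standard deviation of daily returns, scaled to a yearly figure."""
    return returns.rolling(window).std() * np.sqrt(TRADING_DAYS)


def sharpe_ratio(returns: pd.Series, risk_free_annual: float = 0.0) -> float:
    """Annualized Sharpe ratio from daily returns (risk-free rate is an assumption)."""
    excess = returns - risk_free_annual / TRADING_DAYS
    return float(np.sqrt(TRADING_DAYS) * excess.mean() / excess.std())


def max_drawdown(close: pd.Series) -> float:
    """Largest peak-to-trough decline, returned as a negative fraction."""
    running_peak = close.cummax()
    drawdown = close / running_peak - 1.0
    return float(drawdown.min())


close = pd.Series([100.0, 120.0, 90.0, 95.0, 130.0, 104.0])
returns = close.pct_change().dropna()
vol = rolling_annualized_vol(returns, window=3)  # short window for the toy data
sharpe = sharpe_ratio(returns)
mdd = max_drawdown(close)  # fall from 120 to 90
```

In the toy series the worst decline runs from the peak of 120 to the trough of 90, a drawdown of 25%.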
Visualization
You produce a suite of charts: candlestick charts with overlaid moving averages for individual tickers, a volume bar chart aligned beneath the price chart, a rolling volatility line chart comparing all tickers, and a correlation heatmap of daily returns across the selected stocks. Each chart is saved as a high-resolution PNG for inclusion in the portfolio.
Key findings
- Moving average crossovers: Across all tickers, golden cross events (20-day SMA crossing above 50-day SMA) preceded sustained price appreciation in roughly 60% of instances, which supports their use as a momentum indicator. In sideways markets, however, the signal generated false positives, highlighting the importance of combining it with volume confirmation.
- Volatility clustering: Daily return volatility is not uniform over time. High-volatility periods cluster together, especially around earnings announcements and macroeconomic news events, and are followed by mean reversion to lower volatility. This behavior is consistent with the ARCH effects well documented in financial time series.
- Sector comparisons: Energy sector tickers (e.g., XOM) exhibited significantly higher annualized volatility than large-cap technology names over the same period, despite lower average daily returns. Correlation between technology tickers was high (≥ 0.75), suggesting limited diversification benefit within a single-sector allocation.
- Volume and price relationship: Breakouts accompanied by above-average volume showed stronger follow-through than low-volume breakouts, reinforcing the principle that volume confirms price moves.
Visualizations
The following charts are produced by this analysis:
- Candlestick chart with SMAs: Shows open/high/low/close bars alongside 20-day and 50-day moving averages, making trend direction and crossover events immediately visible.
- Volume bar chart: Displayed below the candlestick chart using a shared x-axis, allowing you to correlate price moves with trading activity.
- Rolling volatility comparison: A multi-line chart plotting 30-day annualized volatility for each ticker over time, revealing when and which stocks experienced stress.
- Correlation matrix heatmap: A symmetric heatmap of pairwise return correlations, color-coded from negative (blue) to positive (red), useful for understanding portfolio diversification.
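The matrix behind the correlation heatmap can be derived from the long-format returns with a pivot. This is a sketch with made-up numbers: the ticker symbols and the helper `return_correlations` are hypothetical, and the plotting call itself is left out.

```python
import pandas as pd


def return_correlations(long_returns: pd.DataFrame) -> pd.DataFrame:
    """Pivot to one column per ticker, then compute pairwise Pearson correlations."""
    wide = long_returns.pivot(index="Date", columns="Ticker", values="Daily_Return")
    return wide.corr()


dates = list(pd.bdate_range("2024-01-02", periods=4))
base = [0.01, -0.02, 0.03, -0.01]
long_returns = pd.DataFrame({
    "Date": dates * 3,
    "Ticker": ["AAA"] * 4 + ["BBB"] * 4 + ["CCC"] * 4,  # hypothetical symbols
    # BBB moves with AAA, CCC moves exactly against it.
    "Daily_Return": base + [2 * x for x in base] + [-x for x in base],
})
corr = return_correlations(long_returns)
```

The resulting square DataFrame is symmetric with ones on the diagonal, and it could be handed to a heatmap function such as seaborn's `heatmap` with a diverging blue-to-red colormap.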
Technologies
| Tool | Purpose |
|---|---|
| Python 3.10+ | Primary programming language |
| Pandas | Data manipulation and rolling statistics |
| Matplotlib | Candlestick, volume, and volatility charts |
| Seaborn | Correlation heatmap |
| yfinance | Historical market data download |
| mplfinance | Candlestick chart rendering |
Related projects
Explore other analyses in this portfolio:
- IMDB Movie Analysis: Exploratory data analysis of film ratings, genres, and box office performance.
- Bank Loan Case Study: Risk analysis and default prediction using lending data.