[Dashboard] Interactive Dashboard
Explore the data interactively with brand filters, price distribution visualizations, and a detailed table of the most expensive products.
Open Interactive Dashboard
Click the button below to explore the interactive Power BI dashboard with full functionality and filters.
View Dashboard on Power BI
*Dashboard created with Power BI. Use filters to explore data by brand.*
The Project
I built a complete data analysis system to compare the men's luxury sneaker market, focusing on two iconic brands: Golden Goose and Balenciaga.
The goal? Create an end-to-end pipeline that runs from automated data collection, through cloud storage, to interactive visualization. The project demonstrates skills in web scraping, cloud computing, and data visualization.
Methodology & Tech Stack
[Tools] Data Collection: Automated Web Scraping
I developed custom scrapers in Python using Playwright to extract data from the brands' official websites. The process includes:
- Automatic handling of cookie consent and overlays
- Dynamic pagination and lazy loading
- Bot detection bypass with stealth techniques
- Structured extraction of 461 products with prices, categories and metadata
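The lazy-loading step above can be sketched as a scroll loop that stops once the number of product cards stops growing. This is a minimal sketch under stated assumptions, not the project's actual scraper: the selectors `button#accept-cookies` and `.product-card` and both function names are hypothetical placeholders. The Playwright import is deferred so the loop logic can be exercised without a browser.

```python
from typing import Callable


def scroll_until_stable(
    count_items: Callable[[], int],
    scroll_once: Callable[[], None],
    max_rounds: int = 30,
    patience: int = 2,
) -> int:
    """Scroll until the item count stops growing for `patience` rounds in a row."""
    last_count = count_items()
    stalled = 0
    for _ in range(max_rounds):
        scroll_once()
        current = count_items()
        if current > last_count:
            last_count, stalled = current, 0
        else:
            stalled += 1
            if stalled >= patience:
                break
    return last_count


def collect_card_texts(url: str) -> list[str]:
    """Hypothetical wiring of the loop into Playwright's sync API."""
    # Deferred import: requires `pip install playwright` and an installed browser.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url)

        # Assumed selector: dismiss the cookie banner only if it is present.
        consent = page.locator("button#accept-cookies")
        if consent.count() > 0:
            consent.first.click()

        cards = page.locator(".product-card")  # assumed selector

        def scroll_once() -> None:
            page.mouse.wheel(0, 4000)    # scroll down to trigger lazy loading
            page.wait_for_timeout(800)   # give new cards time to render

        scroll_until_stable(cards.count, scroll_once)
        texts = cards.all_inner_texts()
        browser.close()
        return texts
```

Keeping the stop condition in a plain function means it can be checked with fake callbacks, while the site-specific selectors stay isolated in the wiring function.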
[Cloud] Storage: AWS S3
Collected data is automatically uploaded to AWS S3 with:
- Organized structure by brand and date
- Versioning enabled to track changes
- Server-side encryption for security
- Pipeline ready for automatic monitoring
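As a rough illustration of the layout described above, the sketch below builds an object key organized by brand and date and uploads one snapshot with server-side encryption. The `raw/` prefix, the `products.json` filename, and both function names are assumptions made for this example; only the Boto3 `put_object` call and its parameters come from the SDK.

```python
from datetime import date


def build_s3_key(brand: str, run_date: date, filename: str = "products.json") -> str:
    """Build a key such as 'raw/golden-goose/2024-01-15/products.json'."""
    slug = brand.strip().lower().replace(" ", "-")
    return f"raw/{slug}/{run_date.isoformat()}/{filename}"


def upload_snapshot(bucket: str, brand: str, run_date: date, payload: bytes) -> str:
    """Upload one scrape snapshot, encrypted at rest with SSE-S3."""
    # Deferred import: requires `pip install boto3` and configured AWS credentials.
    import boto3

    key = build_s3_key(brand, run_date)
    s3 = boto3.client("s3")
    s3.put_object(
        Bucket=bucket,
        Key=key,
        Body=payload,
        ContentType="application/json",
        ServerSideEncryption="AES256",  # S3-managed keys
    )
    return key
```

Versioning is a bucket-level setting rather than a per-upload option, so it would be enabled once on the bucket (for example with Boto3's `put_bucket_versioning`) and then applies to every key written this way.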
Analysis: Power BI
An interactive dashboard that lets you:
- Compare average prices by brand
- Explore product distribution
- Identify pricing patterns
- Filter dynamically by brand
Dataset
- 359 Golden Goose (range: €295 - €1,870)
- 102 Balenciaga (range: €450 - €995)
Extracted data includes: product name, price, category, availability, product ID, and timestamp.
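One way to represent a record with the fields listed above is a small dataclass plus a price parser. This is a sketch, not the project's actual schema: the field names, the `brand` field, and the `parse_price_eur` helper are assumptions, and the parser assumes a comma thousands separator as in the ranges quoted above.

```python
import re
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Product:
    product_id: str
    name: str
    brand: str            # assumed: not in the field list, added to tell brands apart
    price_eur: float
    category: str
    available: bool
    scraped_at: datetime


def parse_price_eur(text: str) -> float:
    """Convert a display price such as '€1,870' or '€ 295' to a float.

    Assumes ',' separates thousands and '.' separates decimals.
    """
    cleaned = re.sub(r"[^\d.,]", "", text).replace(",", "")
    if not cleaned:
        raise ValueError(f"no price found in {text!r}")
    return float(cleaned)
```

Parsing prices into numbers at collection time keeps downstream aggregation in Power BI free of string cleanup.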
Key Insights
Price Positioning
Golden Goose: average price €544, consistent mid-luxury positioning with some premium pieces
Balenciaga: average price €720, premium positioning with ultra-luxury editions
Product Range
Golden Goose: concentrated range, limited variations
Balenciaga: more diversified portfolio with special editions and collaborations
Premium Products
Top-tier Golden Goose models reach €1,870 with premium materials and special editions. Balenciaga's premium range tops out at €995 for designer collaborations.
Distribution
Golden Goose: 78% of dataset, extensive product range
Balenciaga: 22%, focused premium collection
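The shares above follow directly from the product counts in the Dataset section. A short check of the arithmetic:

```python
# Product counts per brand, as listed in the Dataset section.
counts = {"Golden Goose": 359, "Balenciaga": 102}

total = sum(counts.values())
shares = {brand: round(100 * n / total) for brand, n in counts.items()}

print(total)   # 461
print(shares)  # {'Golden Goose': 78, 'Balenciaga': 22}
```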
Conclusions & Next Steps
This project demonstrates a complete data engineering and analysis workflow:
- Automated and scalable data collection
- Cloud storage with AWS best practices
- Interactive visualizations for business insights
- System ready for continuous monitoring
Future developments:
- Expansion to other luxury brands (Gucci, Dior, Prada)
- Expansion to women's category
- Alerting system for detecting new products and price changes
- API for programmatic data access
- Predictive analysis on pricing trends
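The alerting idea could start from a comparison of two scrape snapshots. The sketch below is only an illustration of that approach, assuming each snapshot is a mapping from product ID to price in euros; the function name and the sample IDs are hypothetical.

```python
def diff_snapshots(previous: dict[str, float], current: dict[str, float]) -> dict[str, list]:
    """Compare two {product_id: price} snapshots and report what changed."""
    new_ids = sorted(current.keys() - previous.keys())
    removed_ids = sorted(previous.keys() - current.keys())
    price_changes = sorted(
        (pid, previous[pid], current[pid])
        for pid in current.keys() & previous.keys()
        if current[pid] != previous[pid]
    )
    return {"new": new_ids, "removed": removed_ids, "price_changes": price_changes}


# Hypothetical snapshots from two consecutive runs.
yesterday = {"GG-001": 295.0, "GG-002": 544.0, "BAL-001": 995.0}
today = {"GG-001": 295.0, "GG-002": 560.0, "BAL-002": 450.0}

report = diff_snapshots(yesterday, today)
```

Here `report["new"]` would list `BAL-002`, `report["removed"]` would list `BAL-001`, and `report["price_changes"]` would record the move of `GG-002` from 544.0 to 560.0.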
Complete Tech Stack
Backend & Scraping
- Python 3.11
- Playwright (browser automation)
- Pandas (data manipulation)
- BeautifulSoup (HTML parsing)
Cloud Infrastructure
- AWS S3 (data storage)
- AWS IAM (access management)
- Boto3 (AWS SDK)
Visualization & Analysis
- Power BI Desktop
- Power BI Service
- DAX (data modeling)
Development
- Git/GitHub
- Virtual environments
- Environment variables (.env)