Luxury Sneakers Market Analysis

Comprehensive web scraping and data analysis pipeline for luxury footwear brands

Python | Web Scraping | AWS S3 | Power BI | Data Pipeline

Project Overview

Built an end-to-end data pipeline for analyzing the luxury sneaker market, focusing on the premium brands Golden Goose and Balenciaga. The project covered scraping product data, automating the data pipeline, storing the data in AWS S3, and building interactive Power BI dashboards for market insights.

Key Challenges

  • Bypassed bot detection on protected e-commerce websites in order to scrape product data
  • Designed scalable data architecture using AWS S3 for efficient storage and retrieval
  • Created automated ETL processes to ensure data freshness and consistency
  • Handled complex product variations, pricing structures, and multilingual content
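
One way the pricing and multilingual handling above could look is sketched below. It is a minimal illustration, not the project's actual code: the function name parse_price and the sample price strings are hypothetical, and it assumes that prices arrive as localized text in which either "." or "," may be the decimal separator.

```python
import re
from decimal import Decimal


def parse_price(text: str) -> Decimal:
    """Convert a localized price string to a Decimal (hypothetical helper)."""
    # Keep only digits and the two possible separator characters.
    digits = re.sub(r"[^\d.,]", "", text)
    if not any(ch.isdigit() for ch in digits):
        raise ValueError(f"no numeric content in {text!r}")

    last_dot, last_comma = digits.rfind("."), digits.rfind(",")
    if last_dot != -1 and last_comma != -1:
        # Both separators present: the one that appears last is the decimal mark.
        decimal_sep = "." if last_dot > last_comma else ","
    elif last_dot == -1 and last_comma == -1:
        decimal_sep = ""
    else:
        sep = "." if last_dot != -1 else ","
        tail = digits.rsplit(sep, 1)[1]
        # A single separator followed by 1-2 digits is read as a decimal mark;
        # otherwise it is read as a thousands separator (assumption).
        is_decimal = digits.count(sep) == 1 and len(tail) in (1, 2)
        decimal_sep = sep if is_decimal else ""

    # Drop thousands separators, then normalize the decimal mark to ".".
    for ch in {".", ","} - {decimal_sep}:
        digits = digits.replace(ch, "")
    if decimal_sep:
        digits = digits.replace(decimal_sep, ".")
    return Decimal(digits)


if __name__ == "__main__":
    # Made-up sample strings, not scraped data.
    for sample in ["€ 1.250,00", "$1,250.00", "495 €", "1.250 €"]:
        print(sample, "->", parse_price(sample))
```

Returning Decimal rather than float avoids rounding drift when prices are later aggregated. A three-digit group after a lone separator is ambiguous, so the rule used here is a heuristic that would need checking against each site's locale.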

Technologies Used

Python
BeautifulSoup
Selenium
AWS S3
Power BI
Pandas

Results & Impact

461 Products Analyzed
2 Luxury Brands
100% Automated Pipeline

Luxury Companies Financial Analysis: Kering vs LVMH

Automated AWS Lambda pipeline for PDF extraction and Tableau financial dashboard

AWS Lambda | Python | PDF Extraction | Tableau | Financial Analysis

Project Overview

Built an automated system to extract and analyze financial data from the annual PDF reports of two luxury conglomerates, Kering and LVMH, for 2022-2024. The project uses AWS Lambda for serverless PDF extraction, S3 for cloud storage, and Tableau for interactive financial dashboards covering revenue trends, profitability analysis, and segment performance.
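
As a rough sketch of the serverless entry point, the snippet below shows how a Lambda handler might pick out the PDF files it needs to process. It assumes the function is triggered by S3 upload notifications, which this description does not state. The helper name pdf_objects_from_event and the bucket and key in the sample event are hypothetical. The extraction step itself is omitted.

```python
from urllib.parse import unquote_plus


def pdf_objects_from_event(event: dict) -> list:
    """Return (bucket, key) pairs for each PDF named in an S3 event notification."""
    targets = []
    for record in event.get("Records", []):
        s3_info = record.get("s3", {})
        bucket = s3_info.get("bucket", {}).get("name")
        raw_key = s3_info.get("object", {}).get("key")
        if not bucket or not raw_key:
            continue  # skip records that are not S3 object events
        # Object keys are URL-encoded in S3 notifications ("+" stands for a space).
        key = unquote_plus(raw_key)
        if key.lower().endswith(".pdf"):
            targets.append((bucket, key))
    return targets


def lambda_handler(event, context):
    """Standard Lambda entry point; download and extraction would go here."""
    targets = pdf_objects_from_event(event)
    return {"statusCode": 200, "pdfs": [key for _, key in targets]}


if __name__ == "__main__":
    # Hypothetical event, trimmed to the fields the handler reads.
    sample_event = {
        "Records": [
            {"s3": {"bucket": {"name": "example-reports"},
                    "object": {"key": "kering/annual+report+2023.pdf"}}}
        ]
    }
    print(lambda_handler(sample_event, None))
```

Keeping the event parsing in its own function means it can be tested locally with a plain dictionary, without AWS credentials.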

Key Challenges

  • Developed custom AWS Lambda function with company-specific regex patterns for accurate data extraction from unstructured PDFs
  • Implemented billion/million unit conversion to reconcile the different reporting formats used by Kering and LVMH
  • Created scalable serverless architecture with S3 storage, versioning, and encryption
  • Built interactive Tableau dashboards comparing revenue trends, margins, and geographic/brand segment performance
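
The regex extraction and unit conversion described above might be sketched as follows. This is an illustration under stated assumptions, not the project's actual patterns: AMOUNT_RE, UNIT_FACTORS and to_millions are hypothetical names, the sample sentences are made up, and the sketch assumes amounts are written as a number followed by the word "billion" or "million".

```python
import re
from decimal import Decimal

# Factor that converts each reporting unit to millions (assumed target unit).
UNIT_FACTORS = {"billion": Decimal("1000"), "million": Decimal("1")}

# A number with optional thousands commas and decimals, followed by its unit.
AMOUNT_RE = re.compile(
    r"(?P<number>\d[\d,]*(?:\.\d+)?)\s*(?P<unit>billion|million)\b",
    re.IGNORECASE,
)


def to_millions(text: str):
    """Return the first amount found in text, expressed in millions, or None."""
    match = AMOUNT_RE.search(text)
    if match is None:
        return None
    number = Decimal(match.group("number").replace(",", ""))
    return number * UNIT_FACTORS[match.group("unit").lower()]


if __name__ == "__main__":
    # Made-up sentences in the two reporting styles, not figures from the reports.
    print(to_millions("Revenue reached 12.5 billion euros"))
    print(to_millions("Revenue: EUR 3,400 million"))
```

Normalizing everything to one unit before loading it into the dashboard keeps figures from the two companies directly comparable. Company-specific patterns could be added as further compiled expressions selected by report source.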

Technologies Used

AWS Lambda
AWS S3
Python
PyPDF2
Tableau
RegEx

Results & Impact

6 Annual Reports Analyzed
15+ Financial Metrics Extracted
100% Automated Pipeline