- I hold a university degree in a technical field.
- I specialize in data analysis with a focus on supporting informed decision-making.
- By extracting insights from complex data sets, I help organizations make data-driven decisions that drive business growth.
- Programming Languages: Python, SQL (PostgreSQL, MySQL, ClickHouse), NoSQL (MongoDB).
- Data Analysis & Visualization:
- Libraries: Pandas, NumPy, SciPy, Statsmodels, Pingouin, Plotly, Matplotlib, Seaborn.
- Tools & Frameworks: Dash, Power BI, Tableau, Redash, DataLens, Superset.
- Big Data & Distributed Computing: Apache Spark, Apache Airflow.
- Machine Learning & AI: Scikit-learn, MLlib.
- Time Series Forecasting: Facebook Prophet, Uber Orbit.
- Natural Language Processing: NLTK, SpaCy, TextBlob.
- Web Scraping: BeautifulSoup, Selenium, Scrapy.
- DevOps: Linux, Git, Docker.
- IDEs: VS Code, Google Colab, Jupyter Notebook, Zeppelin, PyCharm.
- Deep data analysis:
- Preprocessing, cleaning, and identifying patterns using visualization to support decision-making.
- Writing complex SQL queries:
- Working with nested queries, window functions, CASE expressions, and WITH (CTE) clauses for data extraction and analysis (a query sketch follows this list).
- Understanding product strategy:
- Knowledge of product development and improvement principles, including analyzing user needs and formulating recommendations for product growth.
- Product metrics analysis:
- LTV, retention rate (RR), conversion rate (CR), ARPU, ARPPU, MAU, DAU, and other key performance indicators.
- Conducting A/B testing:
- Analyzing results with statistical methods to evaluate the effectiveness of changes (a significance-test example appears after this list).
- Cohort analysis and RFM segmentation:
- Identifying user behavior patterns to optimize marketing strategies (an RFM scoring sketch follows this list).
- End-to-End Data Pipelines:
- Building automated ETL processes from databases to dashboards with Airflow orchestration (a DAG skeleton appears after this list).
- Data visualization and dashboard development:
- Creating interactive reports in Tableau, Redash, Power BI, and other tools for presenting analytics.
- Web scraping:
- Extracting data from websites with libraries such as BeautifulSoup, Scrapy, and Selenium for information gathering and analysis (a scraping example follows this list).
- Working with big data:
- Experience with tools and technologies for processing large volumes of data (e.g., Hadoop, Spark).
- Machine Learning Applications:
- Capable of building and applying machine learning models for data analysis tasks, including forecasting, classification, and clustering, to uncover deeper insights and enhance decision-making processes.
- Business and Metric Forecasting:
- Building and interpreting time series forecasts for key business metrics with libraries such as Uber Orbit and Facebook Prophet to support strategic planning and goal-setting (a forecasting sketch appears after this list).
- Working with APIs:
- Integrating and extracting data from various sources via APIs.
- Process Automation:
- Automating data workflows and routine tasks using Linux scripting, Apache Airflow, and other DevOps tools.
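
A query sketch in the style described above, run from Python with pandas and SQLAlchemy: a WITH (CTE) clause, window functions, and a CASE expression. The `orders` table, its columns, and the connection string are hypothetical placeholders.

```python
# Sketch: CTE + window functions + CASE, executed from Python.
# Table, columns, and credentials are hypothetical placeholders.
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("postgresql://user:password@localhost:5432/shop")

query = """
WITH ranked AS (
    SELECT
        user_id,
        order_id,
        amount,
        ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY created_at) AS order_n,
        SUM(amount)  OVER (PARTITION BY user_id)                     AS user_total
    FROM orders
)
SELECT
    user_id,
    CASE WHEN order_n = 1 THEN 'first' ELSE 'repeat' END AS order_type,
    amount,
    user_total
FROM ranked;
"""

df = pd.read_sql(query, engine)  # results land in a DataFrame for further analysis
```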
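
A minimal significance check for the A/B testing item, using `proportions_ztest` from statsmodels; the counts are invented illustration data, and a real analysis would also account for power, effect size, and multiple comparisons.

```python
# Sketch: two-sample z-test on conversion rates; the numbers are made up.
from statsmodels.stats.proportion import proportions_ztest

conversions = [420, 465]     # converted users in control / treatment
visitors = [10_000, 10_000]  # total users per variant

stat, p_value = proportions_ztest(count=conversions, nobs=visitors)
print(f"z = {stat:.2f}, p = {p_value:.4f}")
if p_value < 0.05:
    print("The difference is statistically significant at alpha = 0.05")
```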
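
A compact RFM scoring sketch in pandas for the segmentation item, assuming a transactions file with hypothetical `user_id`, `order_date`, and `amount` columns.

```python
# Sketch: quartile-based RFM scores; input file and columns are assumptions.
import pandas as pd

tx = pd.read_csv("transactions.csv", parse_dates=["order_date"])
snapshot = tx["order_date"].max() + pd.Timedelta(days=1)

rfm = tx.groupby("user_id").agg(
    recency=("order_date", lambda s: (snapshot - s.max()).days),
    frequency=("order_date", "count"),
    monetary=("amount", "sum"),
)

# Lower recency is better; higher frequency and monetary are better.
rfm["R"] = pd.qcut(rfm["recency"], 4, labels=[4, 3, 2, 1]).astype(int)
rfm["F"] = pd.qcut(rfm["frequency"].rank(method="first"), 4, labels=[1, 2, 3, 4]).astype(int)
rfm["M"] = pd.qcut(rfm["monetary"], 4, labels=[1, 2, 3, 4]).astype(int)
rfm["segment"] = rfm["R"].astype(str) + rfm["F"].astype(str) + rfm["M"].astype(str)
```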
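
A skeletal Airflow DAG for the pipeline item, assuming the Airflow 2.x TaskFlow API; the task bodies are stubs, since the real extract/transform/load logic depends on the source systems.

```python
# Sketch: daily ETL DAG skeleton (Airflow 2.x TaskFlow API); bodies are stubs.
from datetime import datetime
from airflow.decorators import dag, task

@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def sales_etl():
    @task
    def extract():
        ...  # pull raw rows from the source database

    @task
    def transform(rows):
        ...  # clean, join, and aggregate

    @task
    def load(rows):
        ...  # write to the data mart feeding the dashboards

    load(transform(extract()))

sales_etl()
```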
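
A minimal requests + BeautifulSoup example for the scraping item; the URL and CSS selectors are hypothetical.

```python
# Sketch: fetch a page and extract items; URL and selectors are placeholders.
import requests
from bs4 import BeautifulSoup

resp = requests.get("https://example.com/catalog", timeout=10)
resp.raise_for_status()

soup = BeautifulSoup(resp.text, "html.parser")
items = [
    {
        "title": card.select_one("h2").get_text(strip=True),
        "price": card.select_one(".price").get_text(strip=True),
    }
    for card in soup.select(".product-card")
]
```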
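
A minimal forecasting sketch with Prophet (the `prophet` package); the input CSV with Prophet's expected `ds`/`y` columns is a hypothetical placeholder.

```python
# Sketch: 90-day forecast of a daily metric; the CSV is a placeholder.
import pandas as pd
from prophet import Prophet

history = pd.read_csv("daily_revenue.csv")  # columns: ds (date), y (value)

model = Prophet(weekly_seasonality=True, yearly_seasonality=True)
model.fit(history)

future = model.make_future_dataframe(periods=90)
forecast = model.predict(future)
print(forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]].tail())
```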
Key Methods: Knowledge Management | Critical Thinking | Research | Content Curation | Information Architecture
A curated knowledge hub demonstrating a systematic approach to data analysis, reflecting expertise in structuring complex information and evaluating technical content.
- Systematized 500+ resources into logical learning paths
- Implemented quality control: selected materials based on accuracy and relevance
- Optimized for usability: structured content for quick navigation
- Enhanced user experience: developed a web version for easy access and navigation
- Synthesized fragmented knowledge into unified framework
- Covered full analytics pipeline from fundamentals to deployment
Stack: Python | ClickHouse | Apache Airflow | Superset | Yandex DataLens | StatsModels | Uber Orbit | Telegram API
Key Methods: A/B Testing | Time Series Forecasting | Anomaly Detection | Cohort Analysis | ETL Pipelines | Dashboard Design
Building the analytics function for a startup: infrastructure, dashboards, A/B testing, forecasting, automated reports, and anomaly detection.
- Built complete data infrastructure from raw events to automated business intelligence
- Designed interactive dashboards for real-time monitoring of user engagement and retention
- Implemented rigorous A/B testing pipeline with statistical validation of feature experiments
- Developed forecasting models for server load prediction and capacity planning
- Created automated reporting system with daily Telegram delivery to stakeholders
- Established real-time anomaly detection for proactive issue resolution (a simplified sketch follows this list)
- Enabled data-driven product decisions through comprehensive analytics ecosystem
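
A simplified sketch of one common approach to the anomaly detection mentioned above, a rolling z-score; the metric name, window, and threshold below are illustrative assumptions, not the project's actual configuration.

```python
# Sketch: rolling z-score anomaly flagging; names and thresholds are assumptions.
import pandas as pd

metric = (
    pd.read_csv("events_per_minute.csv", parse_dates=["ts"])
    .set_index("ts")["count"]
)

window = 60  # minutes of history behind each point
z = (metric - metric.rolling(window).mean()) / metric.rolling(window).std()

anomalies = metric[z.abs() > 3]  # points more than 3 sigma from recent behavior
print(anomalies.tail())
```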
Stack: Python | Pandas | Plotly | Tableau | StatsModels | SciPy | NLTK | TextBlob | Scikit-learn | Pingouin
Key Methods: Time-Series | Anomaly Detection | Custom Metrics | RFM/Cohorts | NLP | Clustering
Comprehensive analysis of Brazilian e-commerce data, uncovering key insights and actionable business recommendations.
- Time-series analysis of sales dynamics, seasonality, and trend decomposition
- Anomaly detection in orders, payments, and delivery times
- Customer profiling (RFM segmentation, clustering, geo-analysis)
- Cohort analysis to track retention and lifetime value (LTV)
- NLP processing of customer reviews (sentiment analysis; a sketch follows this list)
- Hypothesis validation through statistical testing of data-driven assumptions
- Strategic, data-backed recommendations to optimize logistics, enhance customer retention, and drive sales growth
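
A minimal illustration of the sentiment step with TextBlob; the review strings are invented, and since the source reviews are from a Brazilian marketplace, a real pipeline would likely translate them or use a Portuguese-capable model first.

```python
# Sketch: polarity scoring with TextBlob; example reviews are invented.
from textblob import TextBlob

reviews = [
    "Fast delivery, great product!",
    "Package arrived damaged and support never replied.",
]
for text in reviews:
    polarity = TextBlob(text).sentiment.polarity  # -1 (negative) .. +1 (positive)
    label = "positive" if polarity > 0 else "negative" if polarity < 0 else "neutral"
    print(f"{label:>8} ({polarity:+.2f}): {text}")
```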
Stack: Python | PostgreSQL | Airflow | Yandex DataLens | SQLAlchemy | DBLink
Key Methods: ETL Pipelines | Star Schema | Data Warehousing | Dashboard Design | SQL Optimization
End-to-end data pipeline and business intelligence solution for global distributor Wide World Importers.
- Built an automated ETL pipeline transforming OLTP data into an optimized star-schema data mart (one transform step is sketched below)
- Designed and implemented interactive dashboard for sales, logistics, and customer analytics
- Developed daily automated data updates with Airflow DAG orchestration
- Enabled data-driven decision making across sales, procurement, and logistics departments
- Reduced manual reporting time through automated data consolidation and visualization
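
One transform step of the kind described above, sketched with pandas and SQLAlchemy: reshaping OLTP order rows into a fact table keyed to a date dimension. Connection strings, tables, and columns are hypothetical, not the project's actual schema.

```python
# Sketch: OLTP orders -> fact table with a surrogate date key; names are placeholders.
import pandas as pd
from sqlalchemy import create_engine

oltp = create_engine("postgresql://user:password@localhost:5432/wwi_oltp")
dwh = create_engine("postgresql://user:password@localhost:5432/wwi_dwh")

orders = pd.read_sql(
    "SELECT order_id, customer_id, order_date, total FROM orders", oltp
)

# Surrogate YYYYMMDD key joining fact_sales to dim_date in the star schema.
orders["date_key"] = (
    pd.to_datetime(orders["order_date"]).dt.strftime("%Y%m%d").astype(int)
)

fact_sales = orders[["order_id", "customer_id", "date_key", "total"]]
fact_sales.to_sql("fact_sales", dwh, if_exists="append", index=False)
```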
Stack: Python | Pandas | NumPy | SciPy | Plotly | Statsmodels | Scikit-learn | Pingouin | TextBlob | Sphinx
Key Methods: Data Exploration | Statistical Testing | Cohort Analysis | Automated Visualization | Feature Analysis | Machine Learning
Powerful pandas extension that enhances DataFrames with production-ready analytics while maintaining native functionality.
- Seamlessly integrates exploratory analysis, statistical testing and visualization into pandas workflows
- Provides instant insights through automated data profiling and quality checks
- Enables cohort analysis with flexible periodization and metric customization
- Offers built-in statistical methods (bootstrap, effect sizes, group comparisons)
- Generates interactive visualizations with single-command access
- Supports both DataFrame-level and column-specific analysis
- Modular architecture allows extending with domain-specific methods
- Preserves all native pandas functionality for backward compatibility (the accessor pattern it builds on is sketched below)
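
A sketch of the extension mechanism pandas provides for libraries like this one: a registered DataFrame accessor adds namespaced methods while leaving native behavior untouched. The accessor name (`quick`) and method (`profile`) are illustrative, not the package's actual API.

```python
# Sketch: the pandas accessor pattern; `quick` and `profile` are invented names.
import pandas as pd

@pd.api.extensions.register_dataframe_accessor("quick")
class QuickAccessor:
    def __init__(self, df: pd.DataFrame):
        self._df = df

    def profile(self) -> pd.DataFrame:
        """Tiny data-quality summary: dtype, missing share, and cardinality."""
        return pd.DataFrame({
            "dtype": self._df.dtypes.astype(str),
            "missing_pct": self._df.isna().mean() * 100,
            "unique": self._df.nunique(),
        })

df = pd.DataFrame({"a": [1, 2, None], "b": ["x", "y", "y"]})
print(df.quick.profile())  # native pandas methods on df keep working unchanged
```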



