pedrororo/README.md

Pedro A. A. Silva  👨‍💻

Full-Stack Data Engineer • Multimodal ML Specialist
Juiz de Fora, MG — Brazil

LinkedIn · GitHub · Kaggle · Hugging Face


🚀 About Me

PhD-trained problem-solver who turns messy, high-volume data into secure, self-service products.
I build cloud-native lake-houses (AWS + Databricks), automate ETL with Python/Selenium bots, fine-tune YOLOv8 + BLIP on EN/PT FrameNet data, and ship Node/React dashboards on PostgreSQL, all while holding 99.5% SLAs, cutting manual effort by 80%, and reducing costs by 35%.


🛠 Core Strengths

| Area | Impact Highlights |
| --- | --- |
| ETL Automation & Bots | 10+ Python/Selenium templates (retry + alerts) → save 15h/week and trim ops cost 35% |
| Lake-House & Orchestration | 20+ Databricks jobs/day on Delta Lake; dynamic scaling halves failure rates |
| Full-Stack Delivery | AES-256 Node/TS APIs + React/Tailwind portals serving 40 analysts real-time data |
| Multimodal AI | Fine-tuned YOLOv8 + BLIP → +12 BLEU / +9 mAP; models live on HF with Gradio demos |
| GDPR & Security | S3 encryption, RBAC, audit trails; architecture green-lit by C-suite in one review |
| Agile Leadership | 2-week Scrum sprints (Trello/ClickUp) → 20% faster releases; secured budget via ROI-driven roadmaps |

🔧 Tech Stack

Python · PySpark · SQL · Node/TypeScript · React · PostgreSQL · Databricks · Delta Lake ·
AWS (S3, Lambda, IAM) · Docker · Terraform · YOLOv8 · BLIP · Hugging Face · Power BI · Tableau


💼 Professional Journey

Artemis Smart Data · Data Engineer (Full-Stack) · 2024-Present
• Automated 20+ Databricks pipelines/day → 50% fewer failures
• Deployed secure Node/TS API + React dashboard for 40 analysts
• Cut AWS compute spend 35% via serverless re-design

FrameNet Brasil / UFJF · ML Researcher · 2023-2024
• Fine-tuned YOLOv8 & BLIP on EN/PT FrameNet → +12 BLEU / +9 mAP
• Published checkpoints on Hugging Face with live Gradio interface

Vox Radar · Data Scientist · 2021-2022
• Automated 70% of social-data reporting via Selenium/RPA bots

🎓 Academic Background

  • Ph.D. Computational Engineering – UFJF (2018)
    Thesis: Discretization-dependent models for excitable media (Phys. Rev. E)
  • M.Sc. & B.Sc. Mathematics – UFJF (2013) · UFES (2011)

🌐 Languages

PT-BR (native) • EN (C1) • ES (B2)


📫 Let’s Connect

I’m always up for chatting about pipeline automation, lake-house strategy, or multimodal ML. Feel free to reach out!

Pinned

  1. MonoAlg3D_C (Public, C), forked from bergolho/MonoAlg3D_C
  2. Droplets_2D (Public, C++): Droplets
  3. span-finder (Public, Python), forked from hiaoxui/span-finder: Parse sentences by finding & labeling spans
  4. visual_genome_python_driver (Public, Jupyter Notebook), forked from ranjaykrishna/visual_genome_python_driver: A python wrapper for the Visual Genome API
  5. data_garcia_marquez (Public)
  6. insightdataintel/vox-radar-news-scrapping-server (Public)