Bruno Galvão

Senior Data Scientist

Transforming raw data into actionable insights and scalable solutions. Over a decade of experience combining scientific rigor with real-world business impact.

About Me

Bruno Galvão

Senior Data Scientist with a Passion for Innovation

I am a Senior Data Scientist with over a decade of experience transforming raw data into actionable insights and scalable solutions. My journey blends academic research and industry expertise, bridging the gap between rigorous scientific thinking and real-world business impact.

With a strong foundation in Mathematics, Statistics, and Physics, I specialize in statistical modeling, machine learning, causal inference, and MLOps. I am currently researching Generative AI and Robotics applied to Math and Science education.

Key Achievements:

  • Reduced unproductive technical visits by 43% through ML models for predictive analysis
  • Authored 130+ technical reports, driving 35% improvement in operational efficiency
  • Led and mentored junior data scientists, fostering a culture of data-driven decision-making
  • Designed and implemented experimentation frameworks for product and process optimization

Technologies & Tools

Python
Java
JavaScript
C
C++
HTML5
CSS3
PHP
Pandas
NumPy
SciPy
Scikit-Learn
Matplotlib
Seaborn
Plotly
TensorFlow
Keras
PySpark
LangChain
AWS
Databricks
Azure DevOps
Jira
Docker
Delta Lake
PostgreSQL
MySQL
SQL Server
Power BI
VS Code
Jupyter
Colab
Excel
PowerPoint
Office
Markdown
Mermaid

Featured Projects

MIGUEL - Educational Chatbot

Educational Chatbot based on RAG

An educational chatbot based on RAG (Retrieval-Augmented Generation) built with LangChain for intelligent document processing and learning assistance. Features vector databases, semantic search, and LLM integration for enhanced information retrieval.
Python, LangChain, OpenAI, Vector DB, Streamlit, RAG

GDP Prediction Model

Economic forecasting with machine learning

Advanced machine learning model for predicting Gross Domestic Product (GDP) trends using economic indicators. Features an interactive web interface for scenario analysis and economic forecasting with statistical validation.
Python, Scikit-learn, Pandas, Time Series, Streamlit, Economic Analysis

Trading Platform

Financial data analysis and trading interface

Comprehensive trading platform with real-time financial data analysis, portfolio management, and algorithmic trading strategies. Features interactive charts, risk assessment, and performance analytics for informed investment decisions.
Python, Financial Analysis, Plotly, Streamlit, API Integration, Portfolio Management

Life Expectancy Prediction

ML model for health and demographic analysis

Machine learning model that predicts life expectancy based on various health, economic, and social factors. Uses WHO data to analyze relationships between lifestyle, healthcare quality, and longevity across different countries.
Python, Scikit-learn, Pandas, Seaborn, WHO Data, Health Analytics

Exploratory Data Analysis

Comprehensive data exploration techniques

Collection of advanced exploratory data analysis techniques and methodologies. Demonstrates statistical analysis, data visualization best practices, and insights discovery from complex datasets using Python's data science ecosystem.
Python, Pandas, Matplotlib, Seaborn, Statistical Analysis, Data Visualization

E-commerce Data Analysis

Business intelligence for retail insights

Comprehensive analysis of e-commerce data focusing on customer behavior, sales patterns, and business performance metrics. Includes customer segmentation, sales forecasting, and actionable business recommendations.
Python, Pandas, Customer Analytics, Business Intelligence, Sales Analysis, Visualization

Professional Summary

I'm a Data Scientist with over 10 years of experience at BB Tecnologia e Serviços, where I apply data science to solve complex business problems and drive operational excellence.

My expertise spans data visualization, predictive modeling, causal inference, NLP, deep learning, computer vision, pipeline automation, MLOps, large language models (LLMs), and artificial intelligence.

Over the last year, I led a project that reduced unproductive team visits by 43% through machine learning, causal inference, and demand prediction. Additionally, I published over 130 technical reports employing data science techniques to identify bottlenecks and maximize team performance. My work also includes developing analytical dossiers from extensive data and business rules — delivering a 35% improvement in operational efficiency and earning several corporate accolades.

I'm proficient in Python, Java, JavaScript, C, C++, SQL, Git, DAX, Markdown, HTML, PHP, and CSS, and I frequently leverage libraries and frameworks such as Scikit-Learn, TensorFlow, Keras, Seaborn, Matplotlib, Streamlit, Plotly, Pandas, NumPy, SciPy, Dash, and more.

I'm also skilled in a range of tools and platforms, including VS Code, Power BI, Jupyter Notebook, Google Colaboratory, SSIS, AWS, Azure, Databricks, BigQuery, Jira, Trello, Azure DevOps, Power Apps, Power Automate, Excel, MySQL, PostgreSQL, SQLite, and SQL Server.

Currently, I'm pursuing a Master's degree in Physics with a concentration in Data Science, Generative AI, and Robotics for Math and Science education. I hold an MBA in Data Science, Governance, and IT Management, and a degree in Physics from IFCE (Federal Institute of Ceará), where I researched computer-assisted education and peer instruction.

I bring extensive experience in Information Technology, predominantly with Banco do Brasil — the largest bank in Brazil — delivering innovative, data-informed solutions to aid its operations.

Areas of Expertise

Machine Learning

  • Supervised & Unsupervised Learning
  • Deep Learning & Neural Networks
  • Natural Language Processing (NLP)
  • Computer Vision
  • Recommendation Systems
  • Time Series Forecasting
  • Feature Engineering & Selection
  • Model Evaluation & Validation
  • Ensemble Methods
  • Transfer Learning

AI Engineering

  • Large Language Models (LLMs)
  • Retrieval-Augmented Generation (RAG)
  • Generative AI Applications
  • Model Deployment & Serving
  • AI Pipeline Architecture
  • Vector Databases
  • Prompt Engineering
  • AI Ethics & Bias Mitigation
  • Edge AI & Model Optimization
  • AI System Monitoring

Data Science

  • Statistical Modeling & Inference
  • Causal Inference & Experimentation
  • A/B Testing & Hypothesis Testing
  • Exploratory Data Analysis (EDA)
  • Data Visualization & Storytelling
  • Business Intelligence & Analytics
  • Predictive Analytics
  • Customer Segmentation
  • Survival Analysis
  • Uplift Modeling

Data Engineering

  • ETL/ELT Pipeline Development
  • Data Warehouse Architecture
  • Big Data Processing (Spark, Databricks)
  • Cloud Data Platforms (AWS, Azure)
  • Real-time Data Streaming
  • Data Quality & Governance
  • MLOps & Model Lifecycle Management
  • CI/CD for Data Pipelines
  • Database Design & Optimization
  • Data Lake & Delta Lake

Let's Connect

Get In Touch

I'm always open to discussing interesting opportunities, data science projects, or simply exchanging ideas about technology and innovation.

Brazil

Send a Message