Felipe González Data Scientist

Hi! Welcome to Felipe's homepage. I'm a data scientist and data analysis specialist from Colombia.

Hi! I'm Felipe.

My job consists of helping companies and researchers analyze and visualize their large, complex data sets, applying domain knowledge to extract valuable insights. I am skilled in most steps of the data science workflow: data pre-processing, application of statistical techniques, machine learning model development and deployment, data visualization, and communication of results.

In a wide range of subject areas, I have analyzed structured and unstructured data to extract actionable business insights. I like to get involved in NLP and AI projects. I love to craft stunning and clever visualizations that illustrate surprising results.

Before I went into industry, I obtained a degree in physics engineering from UTP. I'm currently pursuing the AWS machine learning certification. I love working with great people, getting creative with data, and building awesome products. Interested in working together or having a chat? Feel free to contact me.


Data Mining

I can identify patterns, trends, outliers, and correlations in your data. I also train models on historical data to make predictions or classifications on new data, and deploy them so they can be integrated into real-world applications.

  • Exploratory Data Analysis of a dataset.
  • Creation of reproducible reports using Jupyter notebooks.
  • Application of statistical methods to your dataset.
  • Building predictive models using machine learning algorithms.

Data Management

Data is often messy and incomplete, so it has to be collected and stored in a suitable format before it can be used.

  • Harvest data from different sources like APIs, SQL databases or messy Excel spreadsheets.
  • Reformat and reshape data to make further analyses easier.
  • Clean datasets by removing outliers, errors, and inconsistencies and by handling missing values.

Data Visualization

By using visual elements like charts, graphs, and maps, data visualization provides an accessible way to see and understand trends, outliers, and patterns in data.

  • Dashboard creation for real-time data analysis.
  • Creation of custom and interactive data visualizations.

Featured Projects


Software defect detection with XGBoost

In this project, I worked on a binary classification competition from the 2023 edition of Kaggle's Playground Series. The main goal is to predict defects in C programs given various attributes of the code.

Repository Notebook


Data Visualizations

Occasionally, I create visualizations of public data, e.g. maps, traffic data, or election results. This includes some dashboards in Looker Studio.

Flights search data Betting data


Feel free to contact me with any questions. For open source projects, please open an issue or pull request on GitHub. If you want to follow my work, reach me on Twitter. Otherwise, send me an email at felipe.gonzalez.data@gmail.com.