about me

I do Data Science, ML, and Software Engineering
and I look for ways to use technology for social good

education
MSc at the University of Porto (FEUP), in 2020.
Spent a semester at the University of Edinburgh as a graduate student,
and another at the University of Toronto developing my MSc thesis:
"High-level Approaches to Detect Malicious Political Activity on Twitter".

work & more
Creator of Desarquivo and Twitter Watch.

Worked at Wise, LIAAD, I3S, SPECS.

Previous speaker at QCon London, Future of Computing, Pixelscamp, WiE ILS,
IBM Qiskit Camp.

projects & tools

Here are some public projects that I have developed over time, with new ones added every now and then.

Desarquivo

A project that seeks to democratize and complement investigative journalism and fact-checking.
Arquivo.pt for justice, journalism and truth.

NLP NER Graph Analysis Docker Neo4j MongoDB

Twitter Watch

Highly configurable Twitter data collection framework.

Dataset creation Docker MongoDB

Hackacity Porto 2019

Improving decision making on the local housing problem using: scraping of booking.com, scripts to gather Porto's open data, cleanup and dataset building, regression techniques to infer which features matter most, clustering, ...

Open Data Regression Clustering Hackathon

Big Data & Cloud Computing

Two-part project on Big Data and Cloud Computing, focusing on handling large amounts of data and performing analysis and Machine Learning tasks.

Tools: Spark | GCP | Dask | dask-ml
Big Data Map-Reduce Cloud Computing Jaccard Index tf-idf
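
For reference on the Jaccard Index tag above, here is a minimal Python sketch of the set-based similarity. The function name and the sample token sets are illustrative assumptions, not code from the project, which ran on Spark and Dask.

```python
def jaccard_index(a, b):
    """Return |a ∩ b| / |a ∪ b| for two iterables, treated as sets."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        # Convention chosen for this sketch: two empty sets count as identical.
        return 1.0
    return len(set_a & set_b) / len(union)


# Hypothetical token sets, purely for illustration.
doc_a = {"big", "data", "cloud"}
doc_b = {"big", "data", "spark", "dask"}
similarity = jaccard_index(doc_a, doc_b)  # 2 shared tokens out of 5 distinct ones
```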

NexToI

NexToI (Next To Implement) is a cross-platform Ionic mobile application for people to sort their personal projects according to a set of categories and choose which one to pursue next.

Tools: Ionic | Angular
cross-platform Play Store

🔬 bioseq

A Python module for biological sequencing. It includes most operations on DNA/RNA/protein sequences, local and global alignment, BLAST, multiple sequence alignment, UPGMA, phylogenetic trees, similarity graphs, ... Developed for the Algorithms for Bioinformatics course. 100% code coverage.

Tools: biopython | NetworkX
Hierarchical clustering Bioinformatics CI
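
To illustrate the global alignment mentioned above, here is a minimal sketch of Needleman-Wunsch scoring in plain Python. It is not taken from bioseq: the function name and the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for the example, and it returns only the score, not the alignment itself.

```python
def global_alignment_score(seq_a, seq_b, match=1, mismatch=-1, gap=-1):
    """Score of the best global alignment of seq_a and seq_b (Needleman-Wunsch)."""
    # previous[j] is the best score for aligning the prefix of seq_a
    # processed so far with seq_b[:j]; the first row is all gaps.
    previous = [j * gap for j in range(len(seq_b) + 1)]
    for i, char_a in enumerate(seq_a, start=1):
        current = [i * gap]  # seq_a[:i] aligned against an empty prefix
        for j, char_b in enumerate(seq_b, start=1):
            diagonal = previous[j - 1] + (match if char_a == char_b else mismatch)
            up = previous[j] + gap        # gap inserted in seq_b
            left = current[j - 1] + gap   # gap inserted in seq_a
            current.append(max(diagonal, up, left))
        previous = current
    return previous[-1]


# Hypothetical sequences: three matches and one gap.
score = global_alignment_score("ACGT", "AGT")
```

Keeping only two rows of the dynamic programming table is enough for the score; recovering the alignment would need the full table and a traceback.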

Alloy4fun

Online tool for testing, sharing and learning the Alloy language. REST API in Java, front and backend for the UI in Meteor.js, database in MongoDB. Everything is isolated in Docker containers.

Tools: Docker | Meteor | MongoDB
Alloy

STUNS (Structuring The UNStructured)

Command-line tool that reads raw data files collected from mobile phone sensors, reports data quality metrics via HTML reports, inserts the organised information into a NoSQL database, and exposes new queries through a Flask REST API. Developed for Fraunhofer.

Agents and Distributed AI

Agents Being Agents in a Maze and Agents Being Scrutinised for Being Agents. Multi-agent system for maze solving using different techniques. Followed by analysis of agents' behaviour.

Tools: Jade | Repast | RapidMiner
Multi-agent simulation Clustering Prediction

🐍 Pyhparser

Pyhparser is a Python module that parses input files by describing the format the data is in, allowing you to use the parsed values as your own variables. Due to its simplicity and speed it is great for Hackathons, Machine Learning and day-to-day data tasks.
I built it to save time in Google Hash Code.

DSL Hackathon

🐸 feup-iart

Artificial Intelligence project
Multiclass classification of Frog Species using Deep Neural Networks in TensorFlow, based on their calls 🔉
Check out the Jupyter Notebook with the source code.
The data is available on a UCI dataset.

DNN Supervised Learning Cross-validation Hyperparameter Tuning Stratified K-Fold Evaluation Confusion Matrix

⚛ Teach Me Quantum

10-week practical course on Quantum Information Science and Quantum Computing with Qiskit and IBMQX. Covered the basics of Quantum Mechanics, the Quantum Circuit Model, common algorithms (Deutsch, Grover, Shor), ...

Quantum Computing Quantum Circuits Quantum Algorithms

💽 LAT - Lara Autotuning Tool

  • Developed at SPeCS at FEUP.
  • LAT is a tool for code autotuning, built using LARA (a JavaScript-based language that supports source code transformations) and applied using the Clava tool.
  • LAT mimics the behaviour of the Intel Software Autotuning Tool (ISAT) for testing multiple instances of the same code. However, it is built entirely on LARA, which makes it platform agnostic and more flexible and easier to extend, both in functionality and in supported languages (not only C and C++).
Tools: cmake | clava | lara
Intel ISAT Autotuning Research

📆 SigTools

Sigarra Tools - Sigarra on Steroids: Productivity Tool to Export Calendar Events, Infinite Scroll, DataTables, and more (SigToCa's heir). Available on the Chrome Web Store, Firefox Add-ons, or by importing the .crx directly.

api extension scraping

🔺 feup-plog

Logic Programming Course, consisting of two projects:
  • Project 1: Prolog implementation of the LYNGK board game (logic and command line interface) - report (pt)
  • Project 2: Constraint Satisfaction Problem (clpfd) consisting of a "Constraint Logic Programming Approach to Teacher Hour Allocation for University Subjects" - paper
  • Extra 1: Includes code for Logic Programming exercises as well as code for 20+ exam solutions
  • Extra 2: Python was used to develop a tool to automatically generate, parse and test mathematically valid problem instances.
CSP Logic parallel board game

🚚 feup-cal

Algorithm Design and Analysis
Garbage truck route generation using real OpenStreetMap data and the Google Maps JavaScript API. Implements several graph algorithms, taking advantage of C/C++ features such as classes, templates, operator overloading, ...

Graph Floyd-Warshall Nearest Neighbour OSM
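
As an illustration of the Floyd-Warshall tag above, here is a minimal sketch of the all-pairs shortest path algorithm. The project itself was written in C/C++; this Python version, the function name, and the sample road segments are assumptions made for the example.

```python
import math


def floyd_warshall(num_nodes, edges):
    """All-pairs shortest path distances for a directed, weighted graph."""
    dist = [[math.inf] * num_nodes for _ in range(num_nodes)]
    for node in range(num_nodes):
        dist[node][node] = 0
    for source, target, weight in edges:
        # Keep the cheapest edge if the same pair appears more than once.
        dist[source][target] = min(dist[source][target], weight)
    # Allow each node k in turn to act as an intermediate stop.
    for k in range(num_nodes):
        for i in range(num_nodes):
            for j in range(num_nodes):
                via_k = dist[i][k] + dist[k][j]
                if via_k < dist[i][j]:
                    dist[i][j] = via_k
    return dist


# Hypothetical one-way road segments: (from, to, length).
roads = [(0, 1, 4), (1, 2, 1), (0, 2, 7), (2, 3, 2)]
distances = floyd_warshall(4, roads)
```

The triple loop costs O(n^3) time, so this approach suits small or medium graphs where distances between every pair of nodes are needed.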

Pocket Reading Time Estimation Visualization

Estimates how long each Pocket article will take to read and adds a tag such as 10 min.

api article metadata script
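
The reading-time estimate described above can be sketched as a word count divided by a reading speed. The 200 words-per-minute default and the function name are assumptions for illustration, not values taken from the script.

```python
import math


def reading_time_tag(word_count, words_per_minute=200):
    """Build a tag such as '10 min' from an article's word count."""
    # Round up so short articles still get at least a one-minute tag.
    minutes = max(1, math.ceil(word_count / words_per_minute))
    return f"{minutes} min"


# Hypothetical article of 2000 words at the assumed reading speed.
tag = reading_time_tag(2000)
```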

feup-ltw

Web Technologies Laboratory
A TODO list management platform. The JavaScript interface, including AJAX and events, was written in pure JavaScript (intentionally). I also built a primitive Query Builder inspired by Laravel's queries.

AJAX UML mobile XSS CSRF

Matrix - Bank Security Matrix Card Management

Real World Android Application to manage Bank Security Matrices: Minimalist, efficient, data is encrypted and the app is password protected. This allows people to stop carrying paper copies of their security matrices.
Android Play Store Encryption

feup-laig

Graphical Applications Laboratory
  • Design and application of a Graphics Primitives parser (XML-like language)
  • LYNGK board game visual interface using a Prolog server (from the feup-plog project) for the game engine.

Tools: WebGL | WebCGF | mongoose
3D shaders

feup-lbaw

Database and Web Applications Laboratory
A Community News Platform - VECTO. Using prepared statements, triggers, Ajax, Laravel, Docker, ...

AJAX UML mobile

feup-sdis

Distributed Systems
Distributed P2P File Backup System. Design and implementation of a new Peer Clustering Algorithm (see full description)
OSI FTP

feup-cgra

Computer Graphics
A WebGL (through WebCGF) project consisting of a user-controllable submarine.

3D keyboard

feup-rcom

Computer Networks
  • Implementation of a Communication Protocol (Data-Link Layer - 2 in OSI Model)
  • Implementation of an FTP client
  • Full Network Configuration with Router and Switch
OSI FTP

feup-sope

Operating Systems
A tool that mimics Unix's find, along with a Sauna Client Generation and Management system. Interesting features:
  • Signal and Signal Handling
  • Multiprocessing and Multithreading
  • Command line arguments
  • Pipes, FIFOs, Mutexes, semaphores
OS Concurrency Linux

feup-aeda

Algorithms and Data Structures Project
Bookings Management System - a proof of concept implemented in C++. Using BSTs and HashTables.
Tools: doxygen
BST HashTable

feup-prog

C/C++ Introductory Programming Course
A Store Client and Stock Management tool with command line interface.
CLI

feup-lcom

Computer Laboratory
A text editor implemented in the MINIX OS, Device Drivers and all that jazz.
MINIX