I'm a Director of Data Science at Expedia Affiliate Network (Expedia, Inc). Previously I was a Data Scientist at Feedzai, where I performed R&D in big data and machine learning, applied to large-scale credit card fraud detection. Before that, I was a Big Data Intern at Siemens Research in Princeton, NJ, where I implemented efficient search algorithms using Hadoop. I obtained a PhD in Machine Learning and Computer Science from the University of Minho, where I researched scalable machine learning algorithms for temporal data. Earlier, I was with Nokia Siemens as an R&D software engineer. I graduated in Computer Science and Systems Engineering in 2006.
Director Data Science in the Analytics team
Running the data science team.
Sr Data Scientist in the Analytics team
Sr Data Scientist in the Analytics team of the world's largest travel company.
Data Scientist in a fraud prevention solution
Research and development in a large-scale credit card fraud prevention solution, which processes 2B credit card transactions a year. I worked as a data scientist on the fraud detection classification tools. I also led the development of a REST API web service that helps online merchants detect fraud in their payments.
Researcher in search and indexing techniques in big data
Developed efficient search techniques for big data using Hadoop. The goal was to retrieve the Top-K nearest neighbors to the query sequence, for N queries at the same time. I also implemented a state-of-the-art index for very fast approximate search.
PhD in Machine Learning and Computer Science
Research and development of highly scalable pattern discovery algorithms for terabyte-sized disk-based or streaming data, and of statistical evaluation measures for pattern discovery algorithms, published in top-tier conferences. One of the approaches won the Google-sponsored best student paper award and was also published in a journal.
Research and Development in a telecommunication networks analysis product
Research and development in a leading telecommunications network management and analysis product (SPOTS). SPOTS is implemented in more than 90 countries at top mobile telecommunication operators (e.g. Vodafone, T-Mobile). After just 1 year, I was leading the online monitoring subsystem. This subsystem monitors the properties of thousands of network objects simultaneously, triggering alarm events when anomalies are detected. I was also responsible for the product’s System Monitoring tool, and performed the research, analysis, and specification of Adaptive Thresholding features for the Real Time subsystem. Main technical skills covered: Java, C++, ClearCase, Unix shell scripting, machine learning. http://www.nsn.com
Computer Science and Systems Engineering degree
During my 5-year undergraduate degree at the University of Minho I became well versed in programming in Java/C/C++/Perl/VB/SQL/PHP/HTML/Haskell/Prolog, artificial intelligence, machine learning, statistics, databases, object-oriented programming, UML, software engineering, web programming, networks, data structures and algorithms, computation theory, cryptography, GUI design, XML/XPath/XSL, distributed programming, operating systems and computer architectures. My efforts were recognized with two university merit awards, in 2003 and 2004. I spent a semester abroad at Utrecht University (The Netherlands), where I attended courses and carried out projects supervised by Doaitse Swierstra. I spent my last semester doing research in event forecasting at Siemens as an intern, where my project achieved a grade of 19/20. http://www.di.uminho.pt
Nuno Constantino Castro and Paulo J. Azevedo
Extends the co-winner of the Google-sponsored best student paper award from SDM’11. This paper gives a method to evaluate the statistical significance of discovered patterns in time series, enabling ranking and filtering of the often large number of patterns discovered by time series data mining techniques. Besides further details on the algorithms, this paper includes additional results.
Nuno Constantino Castro and Paulo J. Azevedo
An approach for assessing, for the first time in the literature, the statistical significance of time series patterns. Statistical significance tests are used to assess each pattern’s p-value. [Best Student Paper].
Multiresolution Motif Discovery in Time Series
A highly efficient algorithm for pattern discovery in time series data. The algorithm finds all patterns in the database in linear time: it uses a single sequential scan over the database, and it allows the amount of memory used to be adjusted through a clever space-saving approach.
When I'm not hacking something, you'll often find me doing some other cool stuff.
Feel free to drop me a line.
I currently live in London, UK.