
New event: “Barcelona: Hub for Advanced Analytics and Big Data”

 

We are pleased to announce our next event, “Barcelona: Hub for Advanced Analytics and Big Data”, on January 21st at 19:00 at Movistar Centre. Doors will open at 18:45. You can register here.

We will have two great speakers in our panel: Josep Maria Martorell (Associate Director at the Barcelona Supercomputing Center) and Òscar Sala (mVentures Director at the Mobile World Capital Barcelona organization). Both will share their views on Barcelona and its potential to become a European Hub for Advanced Analytics and Big Data. You can see their bios below.

Josep Maria Martorell is Associate Director at the Barcelona Supercomputing Center, Spain’s leading supercomputing centre, specialized in High Performance Computing. Josep Maria has extensive experience in technology and research across government, education and the private sector. Among other positions, he was Director of Research for the Catalan Government and Head of Research at Universitat Ramon Llull, and he is a shareholder and advisor in multiple technology startups in Barcelona.

Òscar Sala is the mVentures Director at the Mobile World Capital Barcelona organization, a venture builder program that addresses the challenge of transforming scientific knowledge into technological solutions. In the past, Òscar held multiple positions related to technology and innovation at Caixabank, was VP of Product Strategy at Strands (a successful local fintech), and was a member of the board at Mobey Forum, a global industry association empowering banks and other financial institutions to lead the future of digital services.

This event would not be possible without the collaboration of Movistar Centre.




Data & Ethics, summary of our last event

In recent months, ethics has emerged as an extremely sensitive topic for the Data and Analytics community. Most likely, one of the main drivers of this wave of concern was the Facebook scandal: Mark Zuckerberg (founder and CEO of Facebook) had to testify before the US Congress about how his company handles its users’ data and how this could have influenced the results of recent elections in several countries. But Facebook is not the only company whose practices are under scrutiny. Tons of questions have also been raised about how much personal data Google collects and how it is being used: according to Guillaume Chaslot (an ex-Google engineer), the YouTube algorithm “does not appear to be optimising for what is truthful, or balanced, or healthy for democracy”.

In other words, we are talking not only about privacy but also about how data could even threaten our political system. As Cathy O’Neil writes in her must-read book Weapons of Math Destruction, “the math-powered applications powering the data economy were based on choices made by fallible human beings. Some of these choices were no doubt made with the best intentions. Nevertheless, many of the models encoded human prejudice, misunderstanding and bias into the software systems that increasingly managed our lives. Like gods, these mathematical models are opaque (…) Their verdicts, even wrong or harmful, were beyond dispute or appeal. And they tended to punish the poor and the oppressed in our society, while making the rich richer”.

As data-driven professionals, we cannot ignore this inconvenient truth and must address it. This is one of the reasons we at BcnAnalytics organised a session to discuss Data & Ethics. As speakers we had Carlos Castillo (Distinguished Research Professor at Universitat Pompeu Fabra) and Gemma Galdon (Founder at Eticas Research & Consulting and Researcher at Universitat de Barcelona).

Carlos focused his talk on algorithmic discrimination. He first reviewed the concept of discrimination from a philosophical perspective and then explained group discrimination, which means “disadvantageous treatment to an individual because he or she belongs to a specific socially salient group”. According to Carlos, a further step is statistical discrimination, which can be observed “when group discrimination happens because of some statistical belief, which means that someone has certain data, has looked at this data and based on statistics extracted from this data has decided to treat someone worse than another person”. After reviewing these concepts, Carlos raised the key issue: machine learning algorithms can discriminate.

Why is that? Machine learning systems take data and extract statistical beliefs from it, so they can discriminate against some individuals regardless of intention or animosity. What matters is the consequence of the algorithm: whether a person is treated worse because he or she belongs to a group. Carlos emphasized that, to avoid this discrimination, models need to optimize not only accuracy but also “the risk of two different populations of not getting the same outcome”. Carlos also highlighted how important it is that systems are transparent: “if you get a negative outcome, you have to have a way to challenge this decision in a way that is effective… If I am denied a loan or parole, I need to have a way of effectively challenge the decision to say the systems was wrong in my case”.
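As a minimal sketch of the kind of check Carlos describes, the Python below compares how often two groups receive the positive outcome from a model. The function names, the toy data and the choice of a simple rate gap are illustrative assumptions on our side, not a method presented in the talk.

```python
from collections import defaultdict


def positive_rate_by_group(decisions, groups):
    """Return the share of positive decisions (1) received by each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for decision, group in zip(decisions, groups):
        totals[group] += 1
        positives[group] += decision
    return {group: positives[group] / totals[group] for group in totals}


def parity_gap(decisions, groups):
    """Difference between the highest and lowest positive rate across groups."""
    rates = positive_rate_by_group(decisions, groups)
    return max(rates.values()) - min(rates.values())


# Toy data (made up): 1 = loan granted, 0 = loan denied.
decisions = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

rates = positive_rate_by_group(decisions, groups)
gap = parity_gap(decisions, groups)
print(rates, gap)
```

A gap close to zero means both groups get the positive outcome at a similar rate. A large gap does not prove unfair treatment on its own, but it is a signal that the model deserves a closer look alongside its accuracy.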

Gemma started her talk by quoting “The Fall of Public Man” by Richard Sennett: “In a city full of sensors and cameras and surveillance everywhere, where would Romeo and Juliet fall in love?”. From Gemma’s perspective, technology is changing our lives and we really need to ask ourselves: Why are we investing in technology? What kind of societies are these technologies creating or promoting? Are we building the cities that we want to build? Do we want to live in a world where everything is remembered? Do we want to live in a world where we can never forget? As she mentioned: “for the first time in history, forgetting is more expensive than remembering. Everything we do is recorded by a camera or a sensor”. Gemma then reviewed real cases of unexpected outcomes of certain technologies, for instance smart borders based on biometrics. They were not part of the legislative debate because they were seen “as technical amendments”, but biometrics have now become our IDs, and certain individuals self-mutilate when they want to hide their identities. In other words, their bodies became their enemies.

Gemma asked herself: “How can we hide behind a technical amendment? And what about false positives? There is no redress mechanism”. According to her, the most burning issue is that we, as a society, did not think technology could fail. But it fails. And this triggers the key issue: the way we do technology is very irresponsible and no one is facing the consequences of their actions, the consequences of their false positives… which might affect human rights. Gemma ended her speech by highlighting that we need to start thinking about how technology is impacting our civilization: “we have the responsibility to decide how we build a social-technical infrastructure that is responsible and desirable for our generation and the next generations”.

See the link to the full session below.




New team members

The BcnAnalytics family keeps growing. In recent weeks, two new people have joined our core team.

The first to join BcnAnalytics was Didac Fortuny. He is a data scientist at Holaluz, a company that connects people to green power. In his own words: “I have a PhD in Physics in which I used data analytics to study the impact of climate change to Mediterranean precipitation. I also teach in a MSc in renewable energy and energy sustainability”.

The latest addition is Alejandra Manrique. She has more than 20 years of experience in data analytics, helping companies get the most value out of their data. In her own words: “I have worked in multiple sectors, water and environment, telecommunications, retail, automotive and media. I have international experience in different countries in Europe, America and Australia”.

Let us welcome Didac and Alejandra.




New Data-Driven events in Barcelona

The Barcelona GSE Data Science Center coordinates and promotes interdisciplinary and methodological research, training, and knowledge transfer in Data Science. They are now organising some academic seminars and conferences. See below their upcoming events for the month of March.

In the field of causality, we want to understand how a system reacts under interventions. These questions go beyond statistical dependences and can therefore not be answered by standard regression or classification techniques. In this tutorial, you will be introduced to the interesting problem of causal inference as well as recent developments in the field. We will introduce structural causal models, formalize interventional distributions, and define causal effects as well as show how to compute them. We will present three ideas that can be used to infer causal structure from data: (1) finding (conditional) independences in the data, (2) restricting structural equation models and (3) exploiting the fact that causal models remain invariant in different environments. If time allows, we will also show how causal concepts could be used in more classical machine learning problems. No prior knowledge about causality is required. The material is also covered in a recently published book (open access).
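To give a flavour of why interventions go beyond regression, here is a minimal sketch of a toy structural causal model with a confounder Z that influences both X and Y. All variable names and probabilities are made up for illustration and are not taken from the tutorial material. The code computes exactly, without sampling, the observational quantity P(Y=1 | X=1) and the interventional quantity P(Y=1 | do(X=1)).

```python
# Toy structural causal model over binary variables (all numbers are made up):
#   Z -> X, Z -> Y, X -> Y
P_Z = {0: 0.5, 1: 0.5}           # P(Z = z)
P_X1_GIVEN_Z = {0: 0.2, 1: 0.8}  # P(X = 1 | Z = z)


def p_y1(x, z):
    """P(Y = 1 | X = x, Z = z) for the toy model."""
    return 0.1 + 0.3 * x + 0.5 * z


def p_x_given_z(x, z):
    """P(X = x | Z = z)."""
    return P_X1_GIVEN_Z[z] if x == 1 else 1.0 - P_X1_GIVEN_Z[z]


def p_y1_given_x(x):
    """Observational P(Y = 1 | X = x): Z is weighted by how likely it is given X."""
    p_x = sum(P_Z[z] * p_x_given_z(x, z) for z in P_Z)
    joint = sum(P_Z[z] * p_x_given_z(x, z) * p_y1(x, z) for z in P_Z)
    return joint / p_x


def p_y1_do_x(x):
    """Interventional P(Y = 1 | do(X = x)): X is set by hand, so Z keeps its own distribution."""
    return sum(P_Z[z] * p_y1(x, z) for z in P_Z)


observational = p_y1_given_x(1)
interventional = p_y1_do_x(1)
causal_effect = p_y1_do_x(1) - p_y1_do_x(0)
print(observational, interventional, causal_effect)
```

In this toy model the observational value is larger than the interventional one, because seeing X=1 also makes Z=1 more likely. Conditioning mixes the effect of X with the effect of the confounder, while the intervention isolates the effect of X.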

The course will offer an introduction to deep learning along with an extensive practical hands-on session in Python. We will cover deep feedforward models, convolutional networks used mainly in image processing, recurrent neural networks used commonly in text processing, autoencoders, word2vec, as well as introduce optimization for deep learning. During the hands-on workshop, we will use deep learning techniques on images and natural-language text.

Bayes Comp is a biennial conference sponsored by the ISBA section of the same name. The conference and the section both aim to promote original research into Bayesian computational methods for inference and decision making and to encourage the use of frontier computational tools among practitioners, the development of adapted software, languages, platforms, and dedicated machines, and to translate and disseminate methods developed in other disciplines among statisticians.



Do not miss our next event: Data & Ethics

At BcnAnalytics we are really passionate about data. At the same time, we also have some concerns about the ethical aspects of a data-driven world. So, we are pleased to announce that our next event will focus on “Data and Ethics”.

The event will take place on April 11th at 19:00 at MWC, and as usual doors will open at 18:45.

We will have two great speakers in our panel: Carlos Castillo (Distinguished Research Professor at Universitat Pompeu Fabra) and Gemma Galdon (Founder at Eticas Research & Consulting and Researcher at Universitat de Barcelona). Both will share their views on the ethical aspects of using data and building algorithms. They will raise concerns around bias, discrimination and opacity in a data-driven world and how this might negatively affect certain people’s lives.

As usual, after the talks we will have time for networking and free cold beers.

If you want to attend, you can register here

This event would not be possible without the collaboration of Movistar Centre.




THE BCN AIR QUALITY DATATHON CHRONICLES

On Sunday January 21st, at about 14:00, the winners of the BCN Air Quality Datathon were announced by the jury. This scene concluded an intense weekend in which 12 teams, made up of data scientists with all kinds of backgrounds and from different countries, worked hard to achieve a clear goal: use data to improve the air quality predictions that the Barcelona Supercomputing Center (BSC) performs with the CALIOPE system.

It all began on Saturday 20th at 9:00, when the first participants arrived and collected the wonderful green t-shirt with the motto “Keep modelling and mind the air quality”. Then, after kind words from our host Vicenç Villatoro (the director of CCCB), Janet Sanz (Deputy Mayor for Ecology, Urbanism and Mobility of Barcelona), and people from the companies that made the event possible (the sponsors Gauss&Neumann, Social Point and Holaluz), the datathon was presented and the challenge was made public to the participants.

Given the NO2 concentrations observed hourly at 7 measurement stations, and the hourly NO2 predictions produced every day by the CALIOPE system, the challenge was to find the model that best predicted, for a set of days in 2015, the probability that the concentration would exceed a threshold of 100 µg/m3 in at least one hour of the day.
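For readers who want to picture the target, here is a minimal sketch of how a day could be labelled from its hourly readings and how probability forecasts could be scored. The function names and the sample readings are made up, treating “exceed” as strictly above the threshold is our assumption, and the Brier score is only one common way to score probabilities; this post does not state which metric the jury actually used.

```python
THRESHOLD_UG_M3 = 100.0  # threshold stated in the challenge description


def day_exceeds(hourly_no2, threshold=THRESHOLD_UG_M3):
    """True if at least one hourly reading is strictly above the threshold (assumed reading of "exceed")."""
    return any(value > threshold for value in hourly_no2)


def brier_score(predicted_probs, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes (lower is better)."""
    pairs = list(zip(predicted_probs, outcomes))
    return sum((prob - outcome) ** 2 for prob, outcome in pairs) / len(pairs)


# Made-up hourly readings for two days at one station (illustrative values only).
calm_day = [40.0] * 24
busy_day = [40.0] * 23 + [120.0]

labels = [int(day_exceeds(calm_day)), int(day_exceeds(busy_day))]

# Hypothetical model output: predicted probability of exceedance for each day.
score = brier_score([0.1, 0.8], labels)
print(labels, score)
```

A single hour above the threshold is enough to mark the whole day as an exceedance, which is why the second day is labelled 1 even though 23 of its readings are low.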

After that, the teams had about 24 hours to design and implement their models and submit their predictions. At that moment, the strategies of the different teams started to emerge. Some discussed how to build the model before implementing it, while others started coding straight away to make the most of the available time. While experienced teams used a rigorous methodology to work in parallel at a fast pace, some newbies struggled to find a way to combine different languages or pass data from one computer to another. All of this took place in an atmosphere that was focused but also relaxed.

After a night in which some participants (and some organizers) did not sleep much, the predictions were finally submitted on Sunday morning. Then it was the teams’ turn to describe their work in 4-minute presentations in front of a jury formed by Carlos Pérez García-Pando, Kim Serradell and Maria Teresa Pay from BSC, Marc Torrent from the Big Data Center of Excellence, Salvador Lladó from Leitat, and Manuel Bruscas and Didac Fortuny from BcnAnalytics.

Two awards were given. The accuracy award, which went to the team with the most accurate predictions, consisted of 2000 € and a pass for the Mobile World Congress 2018 for each member of the team. The winning team was “Worthless Without Coffee”, who performed a time series prediction using concentration values of the previous days, predictions of the CALIOPE system, concentration increases, some calendar variables and the characteristics of the measurement stations. They have kindly agreed to share their code, which can be found following this link.

The creativity award took into account the originality in facing the challenge and the insights found within the data. The winners of this award were the team “Dreamers”, who proposed some appealing policies to improve the air quality, and the team “Alpha”, who made useful suggestions to the members of the BSC to improve their predictions based on what they observed within the data. Each team won 600€ and passes for the 4 Years From Now 2018 event.

The datathon is over but there is still room to improve air quality predictions. For this reason, the data set will be kept public and any restless data scientist will be able to access it and keep working on the problem. Following this link anyone can download the data and the documentation given in the datathon. So, data scientists, keep modelling and mind the air quality!




Datathon about pollution in Barcelona

 

A few years ago, when we created BcnAnalytics, our vision was that Barcelona could become a European analytics hub. Our ambition was to help the different members of the community (business, academia, data professionals) meet and share experiences and knowledge. Now, three years later, we feel proud of what we have accomplished. We have organised 10 meet-ups where fantastic speakers from great organisations have shared their expertise: we had guests from Google, New York University, King.com, La Caixa, Telefonica, Schibsted, Social Point, BBVA, IPSOS and Vistaprint, among others. We also had the chance to organise two Datathons with Social Point so data scientists could compete to win some prizes while having fun with data.

But when we created BcnAnalytics we also wanted to contribute to making Barcelona a better city. We truly believe data can also be used for good. So, we are proud to announce we are co-organising the first “Barcelona Pollution Datathon”. What is the Datathon about? First let me share some data points. According to this article, 95% of people in Barcelona are breathing more particle pollution than is recommended by the World Health Organisation. If you think about it, this is scary. In fact, a study reveals that air pollution in Barcelona rises by 48% on public transport strike days. We thought we could do something to raise awareness of pollution levels and also get data scientists involved. So we decided to co-organise an event with CCCB, BSC, Leitat and Eurecat. That was the spur for this first “Barcelona Pollution Datathon”.

The Datathon is going to be part of the exhibition “After the End of the World”, which is being organised by CCCB. Participants in the datathon will have to build a prediction model for Barcelona pollution levels. We have more than 3.000€ in prizes thanks to our sponsors Social Point, Holaluz and Gauss & Neumann. We also have the support of Mobile World Congress.

The Datathon will take place on January 20th and 21st, and registration is now open through this link: https://docs.google.com/forms/d/e/1FAIpQLSf6cMswcosinjbC6-VS13Ih4fIYUPaC3LYV5VcOGxLIiK8-IQ/viewform

Do not miss it and let people know they can make a difference and also have access to great prizes.

Note: this blog entry has been written by Manuel Bruscas, co-founder of BcnAnalytics. The opinions expressed in this article are the author’s own.




Next event: Machine Learning series III

We are pleased to announce our next event “How start-ups are using Machine Learning to disrupt industries”.

We will have two great speakers in our panel: LongLong Yu (Co-Founder & Head of Research at Wide Eyes) and Aleix Ruiz de Vila (Chief Data Officer at Onna). Both will share their learnings and insights on how their companies are disrupting some industries by applying various machine learning techniques.

Bio LongLong Yu: He received an MSc degree in computer vision and artificial intelligence from the Autonomous University of Barcelona (UAB) in 2013. In the same year he co-founded the artificial intelligence and image recognition company Wide Eyes Technologies (Wide Eyes). His career as a computer vision and machine learning geek started with human detection for surveillance and face recognition for biometric analysis. Currently, he is Member of the Board and Head of Research and Innovation at Wide Eyes and focuses mainly on image classification, retrieval and object detection for the fashion industry.

Bio Aleix Ruiz de Vila: He holds a PhD in mathematics and has applied machine learning in areas such as transportation, journalism and retail. Currently, as Chief Data Officer at Onna, he is responsible for developing machine learning models for document management and putting them into production. He is a cofounder of the Barcelona R Users Group and Barcelona Machine Learning Study Group meetups, and has also collaborated with BcnAnalytics.

REGISTER HERE

WHEN
Monday, October 30th, 2017 at 19:00 (registration opens at 18:45)

WHERE
Mobile World Centre – Fontanella 2, 08002 Barcelona




Python Hack 17

Are you coming to this year’s PyConEs? Don’t miss the chance to participate in the PythonHack that Kernel Analytics is organising!

We offer three different challenges:
Accuracy Contest: if you want to prove that your models can beat the rest, this is your contest.

Web App solution: for back-end and front-end developers that can give an operative and fancy solution to an open question

Happy Hour challenge: in-person machine learning challenge. Participants will have free drinks during the challenge!
So that you can attend all the conference talks, Kernel is releasing the datasets of the Accuracy Contest and Web App solution one week before PyConEs begins.

Happy Hour will also take place after Saturday’s conferences end.

Choose which contest you want to participate in and register as soon as you can because there are limited spots.

Here is our website, with detailed information and the registration form: https://hackathon.kernel-analytics.com/#/

Start working soon so you can be one of the lucky winners of these prizes 😉


For any questions, e-mail us at hackathon@kernel-analytics.com
Hope to see you in Cáceres!