Are you coming to this year’s PyConEs? Don’t miss the chance to participate in the PythonHack organised by Kernel Analytics!
We offer three different challenges:
Accuracy Contest: if you want to prove that your models can beat the rest, this is your contest.
Web App solution: for back-end and front-end developers who can deliver a working, polished solution to an open question.
Happy Hour challenge: in-person machine learning challenge. Participants will have free drinks during the challenge!
So that you can attend all the talks, Kernel is releasing the datasets for the Accuracy Contest and the Web App solution one week before PyConEs begins.
The Happy Hour challenge will take place after Saturday’s talks end.
Choose which contest you want to participate in and register as soon as you can because there are limited spots.
Check out the full video of our last event with Joan Bruna, Assistant Professor at the Courant Institute (NYU). Joan shows us some important applications of Deep Learning and discusses the next challenges in this hot field.
We are pleased to announce the next event in our series: Machine Learning Series II. It is a great honour to have on our panel Joan Bruna, Assistant Professor at the Courant Institute, NYU, in the Department of Computer Science.
Joan Bruna will share lessons and insights from applying various machine learning techniques to a number of different use cases throughout his career, such as image or real-time video recognition, among others. The event will combine an initial master class and a debate with the audience.
Wednesday, July 19th, 2017 at 19:00 (registration opens at 18:45)
WHERE Aula Capella, ground floor, Historical Building of the Universitat de Barcelona, Plaça Universitat (Gran Via de les Corts Catalanes, 585)
Bio. Joan Bruna is an Assistant Professor at Courant Institute, NYU, in the Department of Computer Science, Department of Mathematics (affiliated) and the Center for Data Science, since Fall 2016. He is currently on leave from UC Berkeley (Statistics Department).
His research interests touch several areas of Machine Learning, Signal Processing and High-Dimensional Statistics. In particular, in the past few years he has been working on Deep Convolutional Networks, studying some of their theoretical properties and applications to several Computer Vision tasks.
Before that, he worked at FAIR (Facebook AI Research) in New York on Unsupervised Learning. Prior to that, he was a postdoctoral researcher at Courant Institute, NYU, under the supervision of Prof. Yann LeCun.
Joan completed his PhD in 2013 at Ecole Polytechnique, France, under the supervision of Prof. Stephane Mallat. Before his PhD he was a Research Engineer at a semiconductor company, developing real-time video processing algorithms. Even before that, he did an MSc in Applied Mathematics (MVA) at Ecole Normale Superieure de Cachan and his undergraduate studies at UPC (Universitat Politècnica de Catalunya, Barcelona) in both Mathematics and Telecommunication Engineering.
Interestingly, ML has been here for decades (ask Alan Turing!), but most of the time it seemed to be a sleeping giant facing a long, cold winter. Now it seems to be fully emerging and expectations are massive. In fact, according to some experts, we are about to start a new era where machines will replace people in almost every single activity we currently consider “human” (doctors, lawyers or teachers, your profession is also at risk!). As a result, ML is getting a lot of traction in the media, and an increasing number of philosophers and sociologists are discussing how AI and ML will change the role of human beings as we know it. In that regard, I strongly recommend the book “Homo Deus: A Brief History of Tomorrow” by Yuval Noah Harari.
I certainly share the excitement about all the opportunities ML brings (and, to be honest, I also share the concerns about the impact of ML on human beings). However, I also see the risk that Machine Learning ends up becoming a new bubble, like the “BIG DATA hype” we had a few years ago. You might remember: companies were told they only needed to invest money in BIG DATA and then all their problems would go away. CEOs and top management believed in a brave new BIG DATA world where they only had to plug and play a new infrastructure. Reality has been very different: although BIG DATA has added tremendous value to some organisations, in many others it has only generated disappointment. In most cases the big-data winners have been some smart vendors who sold the dream and have been squeezing the big-data orange very effectively. As a rule of thumb, I would say anyone using the expressions “Big Data”, “Machine Learning” and “almost real time” in the same sentence is probably, sorry to say, a “bullshitter”. (Note: if that person also adds some comments about an Attribution solution, you should probably watch your pocket!)
So, what is wrong with Machine Learning? Nothing, “per se”. But we must be realistic about what can and cannot be achieved with ML “in the near future” (reality versus myths). Unfortunately, there is no magic button (yet). From my perspective, setting the right expectations is always a good start: we need to be clear about the real potential ML has in our organisations in the next 12-18 months to ensure we allocate the right resources. If we are too bullish we will probably miss expectations and therefore generate frustration. If that happens, Machine Learning could (again) face a cold winter.
Also, given that not all companies and sectors are equal, an ML strategic review is highly recommended before moving too quickly into execution: Which business areas could benefit most from ML? Do we really need ML in all the business areas? Which ML initiatives can bring value in the short term and which ones are a long-term investment? We should always bear in mind that there are many great projects where you only need “old-school” Analytics / Data Scientists (yes, believe me, you can bring tons of insights without Machine Learning). Last, but not least, companies also need to ensure they appoint the right people to manage Machine Learning initiatives. Successful organisations will have to create a middle layer between the technical teams and the pure business teams. I call these people “translators”.
At BCNAnalytics we truly believe Machine Learning is a crucial topic. That is why we are organising a series of events and talks to generate debates and discussions which will help the community to better shape this fascinating topic: understanding its potential but also raising awareness of its risks. Welcome to the Machine, and as Dave Bowman once said: “Hello, HAL. Do you read me, HAL?”
We are pleased to announce our next event and the first in our series: Machine Learning Series I. In this talk we will hear from two industry leaders how they are applying machine learning techniques to produce business results.
It is a great honour to have on our panel Alexandros Karatzoglou, Scientific Director at Telefonica Research, and Jose A. Rodriguez-Serrano, Data Science Team Lead at BBVA Data & Analytics.
Alexandros Karatzoglou is Scientific Director of Telefonica Research in Barcelona, Spain, working on Machine Learning, Deep Learning, Recommender Systems and Information Retrieval. His team includes researchers in the areas of HCI, Networks and Systems, and creates Machine Learning algorithms for customer data and data generated by network operations. He received his PhD from the Vienna University of Technology and was a visiting fellow at the Statistical Machine Learning group at NICTA/ANU in Canberra, Australia. He frequently teaches Machine Learning courses at the UPF/GSE covering R, Python, Statistics, Recommender Systems and Deep Learning.
Jose A. Rodriguez-Serrano has been a Data Science Team Lead at BBVA Data & Analytics since 2015. Formerly, he was Area Manager of the Machine Learning for Services group at Xerox Research Centre Europe, after being a permanent research scientist since 2010 at the Computer Vision Group. Previously he had been a postdoctoral fellow at the University of Leeds and Loughborough University, UK. He obtained his PhD in 2009 from the Universitat Autonoma de Barcelona and a degree in Physics in 2003 from the University of Barcelona. His main research topics have been image retrieval and learning to represent objects through embeddings of generative models and discriminative embeddings. He has published papers in IEEE Trans. PAMI, IJCV, CVPR and ICCV, among others, and has 25 patents and patent applications. His main interest has been to apply state-of-the-art machine learning research to solve problems of industrial interest, and he is fascinated by the challenge of making research outcomes simple to use for non-experts.
We are pleased to announce our next event: Extracting value from text. In this talk we will hear from three industry leaders how they are using advanced text mining techniques to create value from unstructured data.
It is a great honour to have on our panel Hugo Zaragoza, CEO at Websays, RJ Friedlander, CEO at ReviewPro, and Alberto Robles, CEO at Expert System Iberia.
Hugo Zaragoza is the founding CEO at Websays, a start-up pioneering online opinion analytics. His research is at the frontier of machine learning and information retrieval, always pushing the boundaries of search engines and text mining technologies. Before Websays he worked in industrial research for over 10 years both at Yahoo! Research, where he led the Natural Language Retrieval group, and Microsoft Research, where he helped create the MSN Search engine (now Bing). He is the author of 20 patents and more than 50 articles in renowned international conferences (such as WWW or SIGIR).
RJ Friedlander is the founding CEO at ReviewPro, a leading provider of Guest Intelligence solutions to independent hotel brands worldwide, which include both the Online Reputation Management and the Guest Survey Solution modules. The solution uses advanced text mining techniques to enable hoteliers to obtain deeper insight into operational and service strengths and weaknesses, increasing guest satisfaction and ranking on review sites and OTAs, and driving revenue. He has more than 17 years of experience in Internet and technology in Europe, the US and Asia, with expertise both as an operating executive and as an entrepreneur. He has led projects that have generated more than $300 million in revenue, and spent nearly 10 years as a senior executive in one of Spain’s largest media companies, most recently overseeing the business unit that included the group’s digital media and eCommerce companies.
Alberto Robles is the CEO of Expert System Iberia and former Director of ICM (Intelligent Content Management) at iSOCO, a company acquired by Expert System in 2014. He has long professional experience in the technology sector. During his time at Tecsidel he contributed to the creation of a spin-off focused on the application of digital certificates in public administration, the launch of its division in Latin America and the creation of an independent business unit. At ICONUS (a Comsa Emte Group company), his work helped increase profits and obtain better business efficiency ratios. Alberto holds a degree in Computer Engineering from the Polytechnic University of Catalonia (UPC).
BCNAnalytics and SocialPoint are pleased to invite you to the second Barcelona Gaming Data Hackathon. 40 participants. 10 teams. Over €1.5k in prizes. 24 hours. Food, t-shirts, great views and awesome people. R, Python, ML, … Ready?
The competition will focus on analyzing data from the Monster Legends game (iOS and Android). Teams will participate in two different tracks: the accuracy track, about building a recommendation system for monsters, and the business insights track, about finding actionable insights from the given datasets.
Saturday 15th October
9:30 Presentation by BCNAnalytics and SocialPoint. Team Formation
10:00 Competition starts. Data sets are released and the submission platform begins accepting submissions.
14:00 Lunch all together at a cool place
20:00 SocialPoint offices close
Sunday 16th October
9:00 SocialPoint offices reopen
10:00 Accuracy competition closes. Submission platform stops accepting submissions. Teams can prepare the presentation for the business track.
11:00 Begin presentations
13:00 Jury verdict
14:30 Closing and optional lunch.
Participants and registration process
40 participants, 4 members per team.
To participate in the hackathon, register here. If you have already formed a team, indicate its name in the registration form. If you don’t have a team, register and we will assign you to one.
The deadline for registering is Monday 10th October. Acceptance is first come, first served. Once you or your team have been accepted, you will be notified.
Each participant can only participate in one team.
To be eligible for prizes, all teams must participate in both competitions.
This competition would not be possible without the help of SocialPoint. Thanks!
We are pleased to announce our next event: How to build the data organization. In this talk we will hear from two very important data-driven organizations how to build the infrastructure and culture that enable the extraction of value from data inside a company.
It is a great honour to have on our panel Pau Contreras, Chief Data Officer at Unidad Editorial, and Adam Kinney, VP of Data Science at Schibsted Media Group.
Pau Contreras is the Chief Data Officer at Unidad Editorial, where his job is generating the maximum value from its rich data ecosystem (El Mundo, Marca, Expansión, Telva), leading the strategic development of the people, the systems (Big Data, Data Engineering, Data Science, Digital Analytics) and the business models (Data Monetization) required for the digital transformation of the group. Prior to that, Pau held several positions at Oracle, where, as Senior Director for Strategic Sales, he led Big Data Analytics sales in several industries. Pau holds a degree in Anthropology from UB, an MSc in Software Engineering and Analytics from UPC, a PDG from IESE and a PhD with a thesis based on a Complex Adaptive Systems framework for measuring Organizational Innovation.
Adam Kinney is the VP of Data at Schibsted, the company behind sites like fotocasa.es, vibbo.es, infojobs.net and many other marketplace and media sites worldwide. He comes from Twitter, where he was Head of Analytics, leading a department of over thirty data scientists and engineers. Before Twitter, Adam worked at Google for 5 years in the area of data science and analytics. At Schibsted, he built up the data organisation in record time; it now has over fifty data scientists and engineers distributed across London, Oslo, Stockholm and Barcelona. The data organisation works with all the companies in the Schibsted group to turn data into insights and into data-driven products at scale (like recommendation, profiling for audience targeting, experimentation, natural language processing and image processing). He is currently based in Oslo and before that spent 12 years in San Francisco.
BCNAnalytics co-founder Manuel Bruscas was in charge of presenting the topic of the day, “Analytics and Big Data in Retail”. He explained that in the US alone the retail industry generates $966 billion a year, almost 6% of its GDP. Retail is currently facing several challenges, some of which are the product of technological progress. Through online marketplaces and company websites consumers can now consider a wide range of product sources (online and offline). This has led directly to deep changes within Retail, especially for companies with an offline presence, since they must quickly adapt to changing trends, for example the drop in footfall in physical stores. Technology has also empowered consumers to be more informed than ever, which means that competition will only increase. Therefore companies that wish to remain relevant must learn how to get the most out of their online and/or offline presence through an integrated data-driven strategy.
Jaume Portell (CEO of Beabloo)
Beabloo started in 2008 with the mission of creating new ways of communication between consumers and service providers (e.g. retail companies, government offices, theme parks, etc.). Jaume started his talk with a reflection on how, in the early 2000s, ecommerce and the troves of data it was generating started changing the landscape for physical retail. It was that realization that led Jaume and his colleagues to take action and found Beabloo.
He then brought the topic back to the present day, where after more than a decade of online growth we are seeing a shift back to offline. A perfect example of this is Amazon, which is now opening physical stores and signaling that this is a real trend, fueled by the fact that most transactions still happen outside the web. To fully understand why offline continues to dominate the landscape, it is necessary to understand the customer. Data is the instrument through which we must connect the online and offline worlds.
He made it a recurring point that humans have a general need for physical stimulus. While we might not always need it, there will always be occasions on which we will want to see, feel, smell, taste and hear the products we want to buy before we purchase them. This is visible in the statistics of offline and online transactions: after twenty years of e-commerce evolution, 90% of transactions in the US and Europe still happen offline. This is a gap that the web is unlikely ever to close, and therefore we need to adapt to it.
His proposed solution starts by recognizing that the two worlds are not actually separate; rather, they need to be highly integrated and coordinated to ensure that the most value is derived from them. These days technology is adding new capabilities that allow companies to better relate online and offline, such as using smartphones to understand how customers navigate both worlds and creating campaigns to generate cross-environment awareness. The result of this integrated approach is an abundance of data, processes and metrics that can only be managed by developing new platforms.
To show what he meant, he shared an example covering the NBA stores in China. Beabloo was hired to solve two problems. The first was related to inventory: how to maximize the physical stores’ offer while minimizing expenses. The second was to develop a strategy for communicating with customers visiting the stores. They proposed and implemented a solution that attacked both problems at once: a kiosk that allows customers to browse information about the store and design customized products. Through the kiosk the stores offered an almost unlimited stock of products, and by tying it to Beabloo’s platform they could have cohesive communication at the customer level, which would translate into personal promotions, offers or information.
Víctor Martínez de Albeniz
(Professor of Production, Technology and Operations Management at IESE)
Víctor began by reminding us of how difficult the competitive landscape is for Retail. There is a huge level of complexity that companies need to manage, and it cuts across many of a company’s departments (communications, design, production, logistics, partner relationships, etc.). Another reality of Retail is the volatility of a product’s lifespan and the implications that has for planning and investment. If you under-forecast you miss out on sales, but if you over-forecast you end up with unused stock, also known as losses. So how can Retail companies manage this tough environment? His answer is: use data.
The good news (also, old news) for Retailers is that a lot of data is readily available through the cash register. This already gives them a good amount of insight into which products are sold and in what quantities. Yet nowadays Retailers have additional data sources that allow them to get data not just on what they sold, but also on which products caught their customers’ attention but ultimately did not sell. One method he suggested for getting this data is using cameras to track how customers move through your stores. Wi-Fi can also be used to map how customers move through a store and traverse your physical space. He elaborated on these methodologies by highlighting that while these systems are anonymous, they still allow for an anonymous CRM, which can create powerful offers. For this to actually happen, all these data sources need to be tied to your internal data sources to create a cohesive customer data ecosystem. That is not an easy problem, but it is a huge opportunity for those who develop a way to do this at scale.
He then gave an example of an analysis his team is involved in, which measures the impact of weather on sales for street and mall stores. This analysis can go to different levels, but it starts at overall sales and then moves into individual product groups. As it turns out, weather has a measurable effect on daily sales as well as on the type of products that will move. Knowing this can help maximize revenue and sales.
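To make the idea concrete, here is a minimal sketch of the simplest version of such an analysis: comparing average daily sales on rainy and dry days. It is not the method used by Víctor's team, which was not detailed in the talk; the function name `weather_effect`, the rainy/dry split and all the numbers are assumptions made up for illustration.

```python
from statistics import mean


def weather_effect(daily_sales, rainy):
    """Compare average daily sales on rainy vs dry days.

    daily_sales: list of sales figures, one per day
    rainy:       list of booleans, True if it rained that day
    Returns (rainy-day mean, dry-day mean, relative difference vs dry days).
    """
    wet = [s for s, r in zip(daily_sales, rainy) if r]
    dry = [s for s, r in zip(daily_sales, rainy) if not r]
    wet_mean, dry_mean = mean(wet), mean(dry)
    return wet_mean, dry_mean, (wet_mean - dry_mean) / dry_mean


# Made-up illustrative numbers, not real store data.
sales = [100.0, 80.0, 120.0, 60.0, 110.0, 70.0]
rainy = [False, True, False, True, False, True]

wet_mean, dry_mean, rel = weather_effect(sales, rainy)
print(f"rainy: {wet_mean:.1f}, dry: {dry_mean:.1f}, change: {rel:+.1%}")
```

The same function could be applied first to overall sales and then to each product group separately, mirroring the two levels described above. A real analysis would also need to control for seasonality, weekday and promotions.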
To conclude his talk he gave a second example, on how clickstream data can be used to find the optimal pricing for new products. The methodology he proposed is mostly applicable to companies that have a lot of clickstream data available. It relies on running a test to measure the clickstream results for a new product and comparing them to a baseline drawn from the launch of similar products. Through this method, pricing can be set not blindly, but with a good initial insight into where it should stand.
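As a rough sketch of how a comparison against a baseline could look in code: the decision rule below (keep the highest tested price whose click rate is at least the baseline rate) is our own assumption, not something stated in the talk, and the names `click_rate` and `pick_initial_price` and all figures are hypothetical.

```python
def click_rate(clicks, views):
    """Fraction of product views that resulted in a click."""
    return clicks / views if views else 0.0


def pick_initial_price(test_results, baseline_rate):
    """Pick a starting price from a clickstream test.

    test_results:  dict mapping tested price -> (clicks, views)
    baseline_rate: click rate observed when similar products were launched
    Returns the highest tested price whose click rate is at least the
    baseline, or None if no tested price reaches it (assumed rule).
    """
    acceptable = [
        price
        for price, (clicks, views) in test_results.items()
        if click_rate(clicks, views) >= baseline_rate
    ]
    return max(acceptable) if acceptable else None


# Made-up illustrative numbers.
baseline = click_rate(clicks=450, views=10_000)
test_results = {
    19.99: (60, 1000),
    24.99: (48, 1000),
    29.99: (30, 1000),
}
price = pick_initial_price(test_results, baseline)
print(price)
```

In practice one would also want enough traffic per price point for the rate differences to be statistically meaningful before acting on them.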
Nick Brittain (Director of Strategic Relationships at First Insight)
First Insight is a company that provides insights to Retailers. The main problem they try to solve is how Retailers should plan for sales of new products (meaning expected sales volumes by price). He explained that this is an incredibly difficult question to answer, but that analytics can take Retailers a long way.
Nick explained how First Insight is using the “wisdom of the crowd” to help their customers. “Wisdom of the crowd” can be described as independently asking a group of people a question, then translating those answers into a quantitative measure and finally aggregating all the response data to use the resulting distribution as an estimate. As he explained, this methodology relies on a frequentist analysis of the data that needs to account for several levels of complexity, such as how to determine a sufficient sample and how to weight different answers according to what you know about the respondents (e.g. it may be a good idea to weight a happy customer differently from an unhappy customer). He went on to discuss the complexity of their methodology, since analytics is not the only challenge they face. A big part of the problem is how to collect data in a way that does not bias the customers. The final challenge is how to visualize the data in a way that allows the insights to be extracted.
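The aggregation step described above can be sketched as a weighted average of independent answers. This is only an illustration of the general idea, not First Insight's actual methodology; the function `crowd_estimate`, the weights and the sample answers are assumptions.

```python
def crowd_estimate(answers, weights=None):
    """Aggregate independent numeric answers into a single estimate.

    answers: list of numbers, one per respondent
    weights: optional list of non-negative weights, one per respondent
             (for example, higher for respondents judged more reliable)
    Returns (weighted mean, weighted standard deviation).
    """
    if not answers:
        raise ValueError("at least one answer is required")
    if weights is None:
        weights = [1.0] * len(answers)
    if len(weights) != len(answers):
        raise ValueError("answers and weights must have the same length")

    total = sum(weights)
    mean = sum(a * w for a, w in zip(answers, weights)) / total
    variance = sum(w * (a - mean) ** 2 for a, w in zip(answers, weights)) / total
    return mean, variance ** 0.5


# Made-up example: three respondents guess a fair price for a new product,
# and the second respondent is given twice the weight of the others.
estimate, spread = crowd_estimate([20.0, 30.0, 40.0], weights=[1.0, 2.0, 1.0])
print(estimate, spread)
```

The spread gives a rough sense of how much the crowd disagrees, which is one input to deciding whether the sample is large enough.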
His final point was to highlight how through a methodology like this one, you get the advantage of testing the results from the predictions. This result validation allows for continuous improvement to every step of the funnel, which can lead to very reliable estimates.