Full Mega-PAW 7-Track Agenda – Detailed Session Descriptions
Predictive Analytics World
June 16-20, 2019 – Caesars Palace, Las Vegas
This page shows the full 7-track agenda for the five co-located conferences at Mega-PAW. Mega Pass registration is required for full access. To view the agenda for one individual conference, click here: PAW Business, PAW Financial, PAW Healthcare, PAW Industry 4.0, or Deep Learning World.
Session Levels:
Blue circle sessions are for All Levels
Red triangle sessions are Expert/Practitioner Level
A veteran applying deep learning at the likes of Apple, Samsung, Bosch, GE, and Stanford, Mohammad Shokoohi-Yekta kicks off Mega-PAW 2019 by addressing these Big Questions about deep learning and where it's headed:
- Late-breaking developments applying deep learning in retail, financial services, healthcare, IoT, and autonomous and semi-autonomous vehicles
- Why time series data is The New Big Data and how deep learning leverages this booming, fundamental source of data
- What's coming next and whether deep learning is destined to replace traditional machine learning methods and render them outdated
In the United States, between 1,500 and 3,000 infants and children die due to abuse and neglect each year. Children ages 0-3 are at the greatest risk. The children who survive abuse, neglect, and chronic adversity in early childhood often suffer a lifetime of well-documented physical, mental, educational, and social health problems. The cost of child maltreatment to American society is estimated at $124-585 billion annually.
A distinctive characteristic of the infants and young children most vulnerable to maltreatment is their lack of visibility to professionals. Indeed, approximately half of infants and children who die from child maltreatment are not known to child protection agencies before their deaths occur.
Early detection and intervention may reduce the severity and frequency of outcomes associated with child maltreatment, including death.
In this talk, Dr. Daley will discuss the work of the nonprofit, Predict-Align-Prevent, which implements geospatial machine learning to predict the location of child maltreatment events, strategic planning to optimize the spatial allocation of prevention resources, and longitudinal measurements of population health and safety metrics to determine the effectiveness of prevention programming. Her goal is to discover the combination of prevention services, supports, and infrastructure that reliably prevents child abuse and neglect.
The research on the state of Big Data and Data Science can be truly alarming. According to a 2019 NewVantage survey, 77% of businesses report that “business adoption” of big data and AI initiatives is a challenge. A 2019 Gartner report showed that 80% of AI projects will “remain alchemy, run by wizards” through 2020. Gartner also said in 2018 that nearly 85% of big data projects fail. With all these reports of failure, how can a business truly gain insights from big data? How can you ensure your investment in data science and predictive analytics will yield a return? Join Dr. Ryohei Fujimaki, CEO and Founder of data science automation leader dotData, to see how automation is set to change the world of data science and big data. In this keynote session, Dr. Fujimaki will discuss the impact of Artificial Intelligence and Machine Learning on the field of data science automation. Learn about the four pillars of data science automation: Acceleration, Democratization, Augmentation, and Operationalization, and how you can leverage these to create impactful data science projects that yield results for your business units and provide measurable value from your data science investment.
Multiple surveys show that operationalizing data science, advanced analytics and AI is a major barrier to data-driven decision-making in organizations. Getting even actionable insight across the "last mile" and into operations is hard. In 2018 McKinsey identified that leaders in advanced analytics not only focused on the last mile, they behaved differently. Specifically, they didn't start with the data, but with the decision-making they hoped to change.
In this session that kicks off the Operationalization track, James Taylor analyzes why the last mile is so hard, shares the research that shows how important to success this last mile is, and outlines a practical approach to working backwards to success with data science.
Customer Lifetime Value (CLV) is considered one of the most useful measures for business to consumer (B2C) companies, and is usually considered more valuable than other measures like conversion rate, average order value, and purchase frequency. If an accurate measure of CLV can be obtained, companies can determine which customers to prioritize with marketing messages and discount offers.
Basic CLV is actually quite easy to compute. But more sophisticated analysts and statisticians use parametric models that take into account purchase frequency, purchase recency, churn risk, and even customer age. These models can provide value estimates 5, 8, and even more than 10 years into the future. However, most retailers, while interested in lifetime value, are especially interested in estimating near-term customer value so they can create effective marketing strategies now.
In this talk, SmarterHQ's founding Chief Data Scientist Dean Abbott describes non-parametric machine learning approaches to calculating customer value for retail that can accommodate additional measurements and features not typically used in CLV models. Model summaries and accuracy metrics for several retail clients will illustrate the effectiveness of this style of model.
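To make the "basic CLV is easy to compute" point concrete, here is a minimal Python sketch using hypothetical inputs and margins; the parametric and non-parametric models the talk covers are not shown:

```python
# Naive CLV sketch: margin on expected future spend over a customer's horizon.
# All field names and numbers below are hypothetical, for illustration only.

def basic_clv(avg_order_value, orders_per_year, expected_years, margin=0.30):
    """Basic CLV = average order value x purchase frequency x lifespan x margin."""
    return avg_order_value * orders_per_year * expected_years * margin

# Rank hypothetical customers so marketing can prioritize offers and discounts.
customers = {"c1": (80.0, 4.0, 5.0), "c2": (25.0, 12.0, 2.0)}
for cid, args in sorted(customers.items(), key=lambda kv: -basic_clv(*kv[1])):
    print(cid, round(basic_clv(*args), 2))
```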
At Hopper, we predict airfare from a stream of 30 billion daily prices. In this session, we'll talk shop, covering our process for:
- Personalizing 30 million user conversations through push notifications
- Measuring user travel flexibility and recommending alternative flights and hotels
- Building trust with data
- Open problems
Understanding the intraday microstructure dynamics across different universes of stocks and markets is fundamental to designing any optimal trading strategy, especially for trading diverse portfolios of stocks. In this talk, we discuss how modern machine learning techniques can be used in conjunction with dynamical modeling of intraday phenomena to identify trading strategies that move away from general country/sector classifications and instead respect each stock's particular microstructure characteristics.
Bringing the benefits of AI efforts to frontline workers continues to be a struggle across major healthcare organizations. We worked on a novel, practical approach that directly takes on the workflows of healthcare workers. This session shares the successes and failures in our attempts, and the AIs introduced via this approach to achieve efficiency and/or outcome goals. This practical workflow approach uses AIs as tools, and hence can deploy various AIs for a variety of problems, including patient status tracking and task automation. Having AIs directly in the workflow also enables continuous learning, process improvement, and optimization toward specific goals.
AI is framed by models, sensors and technologies. These often ignore the human who must deal with and trust AI outputs. How do we translate the mental models and senses that humans deploy daily into algorithms that take us from data to inference to action? With the explosion of sensors at the edge, how do we actually make sense at the edge? This presentation draws from a recent Intel study of over 250 people in manufacturing and its supporting ecosystem to explore what it takes to accelerate the adoption of Industry 4.0 in a systems of systems approach.
Over the last few years, convolutional neural networks (CNN) have risen in popularity, especially in the area of computer vision. Many mobile applications running on smartphones and wearable devices would potentially benefit from the new opportunities enabled by deep learning techniques. However, CNNs are by nature computationally and memory intensive, making them challenging to deploy on a mobile device. We explain how to practically bring the power of convolutional neural networks and deep learning to memory and power-constrained devices like smartphones. We’ll illustrate the value of these concepts with real-time demos as well as case studies from Google, Microsoft, Facebook and more. You will walk away with various strategies to circumvent obstacles and build mobile-friendly shallow CNN architectures that significantly reduce memory footprint.
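As one concrete illustration of the kind of memory reduction discussed (a generic MobileNet-style trick, not necessarily the presenter's exact approach), replacing a standard convolution with a depthwise-separable one cuts this layer's parameters roughly sixfold:

```python
# Compare parameter counts for a standard vs. depthwise-separable 3x3 conv.
# Input shape and filter count are illustrative.
import tensorflow as tf

def param_count(layer, name):
    model = tf.keras.Sequential([tf.keras.Input(shape=(224, 224, 3)), layer])
    print(name, model.count_params())

param_count(tf.keras.layers.Conv2D(64, 3, padding="same"), "standard conv:")
param_count(tf.keras.layers.SeparableConv2D(64, 3, padding="same"), "separable conv:")
# Roughly 1,792 vs. 283 parameters here; the savings compound across a deep network.
```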
11:20 am - 11:40 am
Black box algorithms. Test data with test results. Predictions with possibilities only. All of these are reminders of analytics teams that have not yet plugged into the “business.” They appear to be making progress, or at least they are busy, but the results are not tangible. They have not yet created value that can be measured and replicated. Although predictive analytics is a “must” for nearly every business today, there are few companies really putting predictive analytics to work for them.
Why are so many organizations developing predictive capabilities but not putting them to use with their sales, marketing, or operations people? Are companies really getting value from statistical predictions? If they are, how are they measuring that value and showing it on their bottom line? If not, what are they doing to close the gap between data science and everyday business?
In this session, we'll evaluate the three areas that are most neglected and hardest to deal with when putting predictive analytics to work, and show you how to get your predictions used.
- Trust. Getting users to trust the prediction of an algorithm is fraught with biases. “That prediction can’t be right because the data is all wrong.” “I don’t believe that customer will default next month; I am best friends with the CIO and I haven’t heard a word.” Trusting the outcomes of the predictions is the first barrier to overcome.
- Teaching. Teaching sales and operations people to use the predictions can be your secret to having successful deployments of systems that use your predictions. Helping users understand the context around the predictions is essential.
- Technology. Automation and self-service are the keys to use of predictive analytics. It must be easy. It must produce results. IT is the it!
Theresa Kushner, partner in Business Data Leadership, brings over 20 years of experience deploying predictive analytics at IBM, Cisco, VMware, and Dell.
Track 1: BUSINESS - Analytics operationalization & management
11:45 am - 12:05 pm
The success or failure of analytics and data science initiatives often hinges on whether those on the “front lines” of business actually use and follow them. In this talk the presenter will share ideas he has learned over the years that help maximize the chances of successful analytics deployment.
Marketing predictive models have seen significant growth in deployments over the past few years, with many companies rolling them out for retailers. Marketing data provides many good examples of large, robust datasets with clear target variables. It is a common step in model building to do dimensionality reduction or variable selection on input fields in order to improve the quality of the models. At SmarterHQ, we have multiple clients with these models in production. Typically these have hundreds of input fields with overlapping predictive power. To reduce this overlap, many different methods can be deployed, including deviation limits, correlation thresholds, stepwise regression, etc. In this talk, we will discuss methods of input variable selection and their impact on model quality on production data.
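A minimal sketch of one of the methods named above, a correlation threshold; column names and the cutoff are hypothetical, and the other techniques (deviation limits, stepwise regression) follow a similar pattern:

```python
# Drop one of each pair of inputs whose absolute pairwise correlation
# exceeds a cutoff, keeping the first-seen column of each redundant pair.
import numpy as np
import pandas as pd

def drop_correlated(df, threshold=0.9):
    corr = df.corr().abs()
    # Keep only the upper triangle so each pair is considered once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [c for c in upper.columns if (upper[c] > threshold).any()]
    return df.drop(columns=to_drop)

# Tiny demo: "a_copy" is a near-duplicate of "a" and gets dropped.
rng = np.random.default_rng(0)
a = rng.normal(size=1000)
demo = pd.DataFrame({"a": a,
                     "a_copy": a + rng.normal(scale=0.01, size=1000),
                     "b": rng.normal(size=1000)})
print(drop_correlated(demo).columns.tolist())  # ['a', 'b']
```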
Twitter has amazing and unique content that is generated at an enormous velocity internationally. A constant challenge is how to find the relevant content for users so that they can engage in the conversation. Approaches span collaborative filtering and content based recommendation systems for different use cases. This talk gives insight into unique recommendation system challenges at Twitter's scale and what makes this a fun and challenging task.
11:20 am - 11:40 am
This project sets forth the work to be performed for the Customer Acquisition Model, which aims to score the entire through-the-door (TTD) population and eventually make optimal lending decisions. The work will be divided into two phases: Phase 1 will concentrate on building a Minimum Viable Product (MVP). This phase will leverage techniques currently in use, where applicable, including the target, data sources, and transformations. Phase 2 will conduct another round of data exploration with the purpose of identifying additional transformations to increase model lift over what was achieved in Phase 1.
11:45 am - 12:05 pm
Understanding the problem to be solved is the most critical element in a successful project. A model that achieves 99.3% accuracy on the wrong question does not help the client. And not being able to explain why the results occurred, particularly after a change, does not lead to success. It leads to frustration and the inability to use the model. Safety National's first data science project is a clear example of having the right people in the process at the right time.
Until recently, healthcare has not understood the root causes of diseases well enough for prevention; the main approach has historically been to treat patients after onset. While primary prediction scoring systems are routine for CVD patients, the goal is to reach patients before primary events occur. Amgen and a startup partner are co-developing a machine learning solution that uses existing EMR data to develop statistical and machine learning models predicting secondary CVD events. Having more accurate risk prediction models could significantly impact approaches to disease prevention. The session will also cover the role of partnership in sourcing, prototyping, piloting, and scaling novel technologies.
Many organizations are faced with the challenge of how to analyze their sensitive data without hosting it on any public cloud. This talk will focus on companies who collect data from their factory operations and are interested in predicting mechanical failures. The audience will get an overview of how to formulate their business problem, perform feature engineering and build a predictive maintenance model using R/Python.
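As a rough illustration of the kind of model described, here is a minimal Python sketch on synthetic stand-in data kept entirely on-premises; all sensor column names and thresholds are hypothetical:

```python
# Toy predictive-maintenance model: classify whether equipment fails soon
# based on sensor features. Columns and the failure rule are made up.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "vibration_rms": rng.gamma(2.0, 1.0, 5000),
    "bearing_temp": rng.normal(70, 8, 5000),
    "runtime_hours": rng.uniform(0, 20000, 5000),
})
df["failed_within_30d"] = ((df.vibration_rms > 4) & (df.bearing_temp > 75)).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    df.drop(columns="failed_within_30d"), df["failed_within_30d"],
    test_size=0.2, random_state=42)
model = RandomForestClassifier(n_estimators=200, random_state=42).fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test)))
```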
Deep learning certainly has roots in the autonomous vehicle space. However, most trucking companies have a substantial investment in existing Class 8 semi-trailer trucks that are not going to be replaced overnight. Trimble Transportation Mobility is using deep learning technologies, in conjunction with other advanced analytic techniques and state-of-the-art DevOps approaches, to help ensure the safe operation of trucking fleets. While it may be premature for many trucking fleets to embrace autonomous vehicles, TTM has made it possible for those same companies to leverage deep learning as a way to reduce costs and improve safety.
As organizations invest more in predictive analytics, machine learning and AI, they are seeking proven ways to maximize their return on this investment. Many find they are lagging behind in capturing the full value of ML and AI because they can't embed potentially game-changing algorithms into their front line systems and workflows. Industry research has identified that these laggards under-invest in operationalizing their analytics and fail to focus their ML investments on the decision-making that matters most to their business strategy. In this session, James Taylor from Decision Management Solutions and Nathan Patrick Taylor from DataRobot will discuss how leading organizations are successfully driving ML and AI algorithms into frontline systems and workflows. This session will walk through proven techniques for developing decision understanding to be clear what you need your algorithms to do; show how automated ML delivers powerful algorithms that can be rapidly deployed and continuously updated; and then show how you can get your models over the last mile by combining your algorithms with rules-based guiderails and constraints. Join us to see how you can deliver business value from analytics and turn machine learning into business learning.
Businesses are continuing to grow their investments in predictive analytics, ML and AI to enable faster and more accurate decision-making by line-of-business users. Often, however, the ROI of these investments comes under scrutiny as organizations struggle to build the best possible algorithms and deploy them with acceptable time-to-market.
In this session, Aaron Cheng, VP of Data Science at dotData, will discuss and demonstrate an innovation in data science automation using AI-powered feature engineering and automated machine learning. You will see a hands-on demonstration of a new platform, integrated with PySpark on Jupyter Notebook, that radically simplifies the end-to-end data science process through the use of a single line of code. This innovation creates incredible opportunities to accelerate and democratize data science in the enterprise, driving the highest value and providing the ROI modern businesses need to justify their investment in predictive analytics.
How many .edu addresses are in your inbox right now? As organizations pursue digital transformation strategies, challenges related to finding and retaining analytical talent, objectively assessing the relevance of new and emerging technology, and engaging in deep and meaningful innovation with eventual payback are common to all sectors of the economy. Deep, collaborative partnerships with universities can help mitigate many of these challenges. Dr. Camm is an associate dean who oversees two masters programs in analytics at the Wake Forest University School of Business and is also leading the creation of a new Center for Analytics Impact at Wake Forest. Throughout his 35 years in academia, Professor Camm has always focused on real-world problems and has actively engaged with companies including, among others, Procter and Gamble, Owens Corning, GE, Tyco, Ace Hardware, Boar's Head, Brooks Running Shoes and Kroger. He will discuss the ways that organizations should be thinking about working with universities, but typically don't – including research, innovation, "externships," training options, recruitment, and other strategic relationships. After this session, you will never look at universities the same way again.
Financial institutions have long excelled at "analysis." We are overrun with reports, dashboards, key performance indicators and other metrics, and many financial organizations have a history of using this information to make data-driven decisions. But it is no longer enough. With rapid advances in technology and ever increasing stores of data, the opportunity is present for data scientists to dig in and get their hands dirty building real products and services for actual customers. It is time to trade information for engineering. This talk explores the shift from a consulting, insights-generation mindset for analytics to one of data-driven software development and what that means for financial institutions. How can you best structure these new hybrid analytics-engineering teams and how should you set them loose to generate value for the organization? Come to hear more.
With major new players (Amazon, JP Morgan, Berkshire Hathaway), reconfigured players (CVS merged with Aetna), and lots of hospital consolidation, healthcare is going to change. We are on the cusp of a post-hospital era where advanced analytics will enable and support pay for performance, value-based purchasing, pricing optimization, wellness/disease management, evidence-based medicine, and workforce optimization. Meanwhile the government retools its metrics every few years and tries to keep up.
This keynote will confront the role of health analytics as a major force in the changing health care landscape. Professor Rossiter will explain how we are finally entering the post-hospital era, and how all of this will enable the long-awaited managed competition approach to health services delivery.
Field issue (malfunction) incidents are costly for the manufacturer's service department. A normal telematics system has difficulty capturing useful information even with pre-set triggers. In this session, Yong Sun will discuss how a machine learning and deep learning based predictive software/hardware system has been implemented to solve these challenges by 1) identifying when a fault will happen and 2) diagnosing the root cause on the spot based on time series data analysis. Yong Sun will cover a novel technique for addressing the lack of training data for the neural network based root cause analysis.
Predictive modeling using machine learning techniques is transforming every aspect of modern business. Traditional approaches to machine learning are time-consuming, resource-intensive, and highly error-prone. Automated machine learning platforms can make the process of building highly accurate predictive models fast and efficient. In this session, we will show how DataRobot can collaborate with data scientists to quickly build hundreds of highly accurate predictive models in a transparent and flexible manner, generate deep insights, and deliver immediate value to business with easy deployment options.
Come and learn how predictive analytics can be combined with the Decision Model and Notation (DMN) standard to provide business end users with a simple and clear depiction of their business decision context. Using a loan pre-qualification example, we will demonstrate how DMN can leverage predictive analytics models captured in PMML. We will also show how the audit data generated from the DMN execution can be used for dashboarding, business intelligence, or fed back into the predictive model. This approach will enable you to clearly demonstrate to business users how predictive models directly contribute value to day-to-day business operational decisions.
The need for timely decisions in ever-changing, unique scenarios is especially pronounced in the healthcare field, where leveraging AI is expected to become a $6.6 billion market. diwo's Cognitive Framework provides pre-packaged, quantified decisions for healthcare, turning insights into the best possible action. The session will include a demonstration of diwo's cognitive decision making for readmission.
More than a single idea or technology, Industry 4.0 is a confluence of many old and new concepts working in tandem to unleash new sources of value. Going beyond the hype and ambiguity to truly understand what it really means for a system to have cognitive capability, listeners will be empowered with concrete ideas for transitioning from basic automated systems to cognitive systems.
Your advanced analytics efforts aren't all that in sync with your overall business strategy. Don't fret – it seemingly happens to everyone. How do you spot it? What should you do about it? In this session, we'll briefly address how to spot the issue and a few techniques to achieve much improved alignment. Hint: we may suggest that it's a bit crazy to try to find a single person with knowledge and expertise in math, statistics, programming, data wrangling, modeling AND exceptional business knowledge!
In a world where demand outpaces supply, finding and keeping analytics talent has become a real dilemma. Identifying the right mix of business skills and analytics skills can feel like an impossible search. With so many people looking for strong talent, it often becomes difficult to compete. How do you attract the right skills to your team to ensure a strong analytics capability? What types of levels, roles, and titles do you need? What are some of the ways to ensure you retain your analytics talent? This session will discuss different compositions of successful analytics teams, as well as titles, career paths, and tips to win at the salary game.
Today, with ever more data at their fingertips, Machine Learning experts seem to have no shortage of opportunities to create ever better models. Over and over again, research has proven that both the volume and quality of the training data are what differentiate good models from the highest performing ones.
But with an ever-increasing volume of data, and with the constant rise of data-greedy algorithms such as Deep Neural Networks, it is becoming challenging for data scientists to get the volume of labels they need at the speed they need, regardless of their budgetary and time constraints. To address this “Big Data labeling crisis”, most data labeling companies offer solutions based on semi-automation, where a machine learning algorithm predicts labels before this labeled data is sent to an annotator so that he/she can review the results and validate their accuracy.
Unfortunately, even this approach is not always realistic to implement, for example in industries such as healthcare, where obtaining even a single label can cost thousands of dollars.
There is a radically different approach to this problem which focuses on labeling "smarter" rather than labeling faster. Instead of labeling all of the data, it is usually possible to reach the same model accuracy by labeling just a fraction of the data, as long as the most informational rows are labeled. Active Learning allows data scientists to train their models and to build and label training sets simultaneously in order to guarantee the best results with the minimum number of labels. In this talk, I will cover both the promises and challenges of Active Learning, and explain why, all in all, Active Learning is a very promising approach to many industry problems.
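For readers new to the idea, here is a minimal uncertainty-sampling sketch, the simplest Active Learning query strategy; the data is synthetic and the loop is illustrative, not the speaker's actual pipeline:

```python
# Uncertainty sampling: label only the rows the current model is least sure about.
import numpy as np
from sklearn.linear_model import LogisticRegression

def least_confident(model, X_pool, batch_size=10):
    """Indices of pool rows where the model's top-class probability is lowest."""
    confidence = model.predict_proba(X_pool).max(axis=1)
    return np.argsort(confidence)[:batch_size]

# Toy loop on synthetic data: fit on the labeled seed, query the most
# uncertain rows, pretend an annotator labels them, and repeat.
rng = np.random.default_rng(0)
X, y = rng.normal(size=(1000, 5)), rng.integers(0, 2, size=1000)
labeled = list(range(20))  # small labeled seed set
for _ in range(3):
    model = LogisticRegression().fit(X[labeled], y[labeled])
    query = least_confident(model, X, batch_size=10)
    labeled = list(set(labeled) | set(query.tolist()))  # "annotate" queried rows
```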
Internally at SWA, predictive scores are delivered to the sales team via documents known as "Quick Start Guides." The point of these guides is to take an analytics example and repeat it across 40 different hypotheses about the business.
One example of this is a set of models we built to predict the trajectory of YoY growth for individual accounts to see if they will continue with the same YoY growth or go another direction.
While that information on its own is a cool prediction, it doesn't serve the 'boots on the ground', so we built guides that help users understand why an account has come to their attention, along with talking points for those influential attributes so the sales force can use them in client conversations.
Right now the projected gain is $15MM in incremental future revenue per year – just by focusing on educating the frontline sales force.
Many companies, regardless of their size and years in business, may not actually have an analytics team, or may have a team of one. During my last speaking engagement at PAW, I spoke with many folks who were interested in creating analytics units but didn't really know how to go about it, or were under the assumption that it would be very cost prohibitive. This is a case study of a team that consists of a former spreadsheet guy, a grad with a fresh master's in engineering, a former rocket scientist, and two former workflow coordinators.
A frequent criticism of the use of machine learning models as compared to human analysis is that ML models are "black boxes" and uninterpretable. Recent advancements in the field of explainable AI allow us to understand what factors influenced both individual predictions and aggregate model behaviors. We will revisit a case study from another PAW conference on predicting hospital readmissions, except this time we will use open-source software and dive into the 'why' with various visualizations that explain the model's behavior.
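The abstract does not name the open-source package; SHAP is one widely used option for exactly this kind of per-prediction and aggregate explanation, sketched here on synthetic stand-in data rather than real readmission records:

```python
# Explain a tree-ensemble classifier with SHAP: aggregate and per-row views.
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic stand-in for readmission features/labels.
X, y = make_classification(n_samples=500, n_features=8, random_state=0)
model = GradientBoostingClassifier().fit(X, y)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

shap.summary_plot(shap_values, X)  # aggregate: which features drive the model
# Per-prediction view for the first row:
shap.force_plot(explainer.expected_value, shap_values[0], X[0], matplotlib=True)
```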
We will discuss the approach used to learn topics and their place in a multi-level hierarchy on hundreds of millions of text records. These methods are generalizable beyond the domain in which they were applied. We used a combination of supervised and unsupervised machine learning methods, which we will discuss at more length, including the technologies, algorithms, and results.
Machine learning has been sweeping our industry, and the creativity it is already enabling is incredible. On the flip side there has also been the emergence of technology like Deep Fakes with the possibility to spread disinformation. As a tool maker, is our technology neutral, or are we responsible for creating technology for good? How should we be thinking about biases of multiple forms when training AI? What can go wrong when learning is applied to indiscriminate user data?
At Adobe we look at this problem from multiple angles: weighing the positives of technology against its possible misuses, researching detection technology for manipulated images, assembling diverse teams of experts, and conducting internal training and reviews of technology around Artificial Intelligence.
3:55 pm - 4:15 pm
In the government contracting world, executives default to using domain knowledge to answer strategic questions. Industry experts are skeptical about using predictive analytics. But to remain an industry leader, Humana Military needs a broader perspective to diversify and grow its business.
In this talk, hear how one executive's curiosity about analytics led to a great partnership between executives and data scientists. Through predictive analytics, we discovered new federal opportunities, uncovered what it takes to win contracts, scored our chances of winning, and discovered potential partnerships. The result: expanding our view of what is possible and co-creating our future together.
Track 1: BUSINESS - Analytics operationalization & management
4:20 pm - 4:40 pm
Caesars Entertainment is the world's most geographically diversified casino-entertainment company with major revenue streams from restaurants, entertainment, and hotels in addition to gaming. Caesars' VP of Gaming, Data Science and Fraud Analytics will cover some of the predictive analytics questions Caesars faces and approaches used to address these questions. Topics covered include how Caesars is using deep learning to interpret visual data, predicting key marketing characteristics including future spending and profitability, using machine learning for fraud detection, applying predictive analytics to sportsbook decisions, and valuing entertainment's impact on other parts of the business.
Career rewards -- the long-term value of employment reflecting the trajectories of advancement and pay -- can be used strategically to motivate performance and improve retention. Too often, they are neglected by reward practitioners who focus on benchmarking, and are not the result of deliberate design. In this session, we'll use case studies to show how to measure both the strength and impact of career rewards to optimize the career component of total rewards. In addition, we will demonstrate a methodology that quantifies organization shape in a way that permits alignment with pay on an empirical basis. The session will demonstrate how advanced analytics can inform rewards strategy.
Subscription services have seen tremendous adoption and growth. FabFitFun and StitchFix are household names with valuations in the billions. One of the biggest keys to success in this exploding sector is AI-driven personalization. In this session, I'll cover the most important ways predictive analytics is impacting subscription companies, from onboarding data to world-class recommenders and lots more. I'll also walk you through how I helped a Fortune 200 subscription company: (1) reduce the time required to deploy an AI model from months to hours, (2) increase the team's throughput by more than 3x, and (3) showcase data science know-how throughout the company (of thousands).
In this session, I will provide an overview of logistic regression, GLM logistic regression, decision trees, random forests, gradient boosting, neural networks, and more. Then, a comparison will be made through a case study of building the full life cycle of a predictive model on insurance datasets. Since the business goal and feature engineering stages are similar given the same case study, the comparison of each method, with its advantages and disadvantages, will cover the feature selection, model building, model validation, and model testing stages. Model implementation and interpretability will also be discussed and compared. Finally, we will discuss implementation of these methods in Python and R.
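A hedged sketch of what such a side-by-side comparison might look like in Python with scikit-learn, using synthetic stand-in data rather than the actual insurance datasets:

```python
# Cross-validated AUC comparison across the model families the talk surveys.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "decision tree": DecisionTreeClassifier(max_depth=5),
    "random forest": RandomForestClassifier(n_estimators=200),
    "gradient boosting": GradientBoostingClassifier(),
    "neural network": MLPClassifier(hidden_layer_sizes=(32,), max_iter=500),
}
for name, model in models.items():
    auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
    print(f"{name}: AUC {auc.mean():.3f} +/- {auc.std():.3f}")
```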
Emergency departments have seen a dramatic increase in the number of visits from elderly patients. Many elderly use a personal emergency response system (PERS) to signal for help in case of an incident such as a fall or breathing problems. At Partners Healthcare, we are testing a predictive model that uses PERS data to predict elderly at high risk of emergency department visits. Clinical staff from our homecare program perform interventions with high-risk patients. This presentation will cover the development of the predictive model and its deployment in a randomized controlled trial.
3:55 pm - 4:15 pm
Industry 4.0 can suffer from a real-world application problem when industrial and manufacturing companies only view IoT as a solution during large-scale plant upgrades or new construction. This case study presents how a manufacturing company has been able to generate energy cost savings in the millions of dollars through targeted deployment of sensors and IoT-connected equipment into existing, large (and sometimes very old and dirty) machinery and factories. By lowering the threshold of what projects are considered worthy of an IoT investment, what was previously considered run-of-the-mill operations can suddenly provide insightful information and an exciting ROI.
4:20 pm - 4:40 pm
Plants in the electric utility sector face common operational challenges. They want to optimize output, lower operating costs, maintain reliability, and ensure safety – and they need to meet all of these goals simultaneously.
In this session we'll focus on the opportunities for deploying machine learning and data science methods in a plant setting. The presentation will cover use cases, big data tools, and the implications for plant operation optimization.
Key takeaways:
- Machine learning for the plant: when useful, when not
- How the proliferation of tailored sensors and IoT is changing operations
- Importance of explainable AI
- Checklist for applying data science
At PayPal, achieving four nines of availability is the norm. In the pursuit of exponentially complex additional nines, we have recently embarked on applying deep learning to forecasting datacenter metrics. With little on this topic shared with the open community, this talk will shine a light on how we apply Seq2Seq networks to forecasting CPU and memory metrics at scale. In sharing ideas around building deep networks for forecasting, the talk will highlight how the life of data scientists at PayPal has been greatly simplified by the use of Template Notebooks stitched into stateful and stateless pipelines using PayPal's open source PPExtensions.
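For orientation, here is a generic LSTM encoder-decoder forecaster in Keras; it is not PayPal's actual architecture, and the window sizes and layer widths are illustrative:

```python
# Minimal seq2seq-style forecaster: encode a history window, decode a
# multi-step forecast for several metrics (e.g., CPU and memory).
import tensorflow as tf

LOOKBACK, HORIZON, N_METRICS = 96, 24, 2  # hypothetical window sizes

model = tf.keras.Sequential([
    tf.keras.Input(shape=(LOOKBACK, N_METRICS)),
    tf.keras.layers.LSTM(64),                          # encoder: compress history
    tf.keras.layers.RepeatVector(HORIZON),             # seed the decoder steps
    tf.keras.layers.LSTM(64, return_sequences=True),   # decoder: unroll forecast
    tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(N_METRICS)),
])
model.compile(optimizer="adam", loss="mse")
model.summary()
# model.fit(history_windows, future_windows, epochs=10)  # training data assumed
```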
4:45 pm - 5:05 pm
Pacific Life has made great strides recently in adoption of analytics across the enterprise. This talk will discuss how the organization took talented and separate analytics practices, built a unified vision, accelerated insights and enhanced adoption at all levels. Specific take-aways for the audience will be around driving stakeholder buy-in, building consensus of vision, getting demonstrable value, and tracking iterative wins. Specific frameworks, anecdotes and examples will be used to engage the audience and create actionable best practices.
Track 1: BUSINESS - Analytics operationalization & management
5:10 pm - 5:30 pm
In the age of machine learning, when business stakeholders demand both high accuracy and transparency in predictive models, practitioners must adapt in terms of how they present findings. Evaluation must be applied at all stages in the machine learning workflow -- from the initial POC through the model deployed in production. Each stage places different demands on the metrics we choose, as well as how we communicate and interpret those metrics. This talk will explore this issue and help both developers and product managers navigate the machine learning evaluation landscape.
4:45 pm - 5:05 pm
We are living at the dawn of big social science. Just as physics has particle accelerators and astronomy has orbital telescopes, social scientists can now harness big data and machine learning systems of immense complexity and cost to measure and predict what society is up to. Dstillery has been building one such system for the last decade. In this session, I'll walk you through how the system evolved from its roots in programmatic advertising, how we discovered we were at war with fraudulent data, and how we settled on our philosophy that making good decisions on hundreds of billions of individual pieces of data yields the best results, but at the cost of significant infrastructure and system complexity. Finally, we'll talk about how these shiny new systems don't replace traditional social science methodologies such as surveys, but instead supplement and reinforce them.
Track 2: TECH - Machine learning methods & advanced topics
5:10 pm - 5:30 pm
At Publishers Clearing House, we create – and deploy in real-time – micro-clusters in order to provide our customers the most relevant and curated experience. In this session, you will learn to create and deploy models that lead to higher customer engagement and LTV.
4:45 pm - 5:05 pm
Attribution is about crediting touchpoints in customer interactions with their impact on the sales process, making it the core element of performance marketing. But today, the choice of model is often driven by subjective belief and guessing rather than data and analytics. This explains why, to date, we often find relatively basic models in place, like last-click or last-non-direct. In this session, we will discuss the different models seen in practice, analyze how they perform in different contexts, explore their core ideas (from statistics, game theory, marketing science, and machine learning), and cover their pros & cons. Finally, we will discuss how to turn descriptive attribution into successful predictive analytics.
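Two of the basic models the session critiques, sketched for concreteness; the journey data shape (an ordered list of channel touchpoints per converting customer) is hypothetical:

```python
# Last-click vs. linear attribution over converting customer journeys.
from collections import defaultdict

def attribute(journeys, model="last_click"):
    credit = defaultdict(float)
    for touchpoints in journeys:  # each journey ends in a conversion
        if model == "last_click":
            credit[touchpoints[-1]] += 1.0          # all credit to the last touch
        elif model == "linear":
            for t in touchpoints:                   # spread credit evenly
                credit[t] += 1.0 / len(touchpoints)
    return dict(credit)

journeys = [["display", "search", "email"], ["search", "direct"]]
print(attribute(journeys, "last_click"))  # {'email': 1.0, 'direct': 1.0}
print(attribute(journeys, "linear"))
```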
Track 3: CASE STUDIES - Cross-industry business applications of machine learning
5:10 pm - 5:30 pm
There’s a new sense of urgency from the C-Suite to capture more value from the company’s data. For many organizations, this means accelerating progress toward machine learning. But what does it take to go faster? And can you skip some of the steps in an otherwise steep learning curve?
Drawing on case studies in banking and utilities, this session will provide insights to:
- Recognize where a project fits in the data science lifecycle
- Avoid predictive analytics projects that waste time and money
- Create an action plan that helps you reap the benefits of machine learning
Wavelets have long been known for their strong ability to denoise and transform signals, for example in speech recognition and image processing. More recently, Long Short Term Memory networks (LSTMs) and autoencoders have started showing promising results in this space as well. Both can also be applied to data sets generated by complex nonlinear processes with a low signal-to-noise ratio. Both wavelets and LSTMs offer value by generating cleaner training data and ultimately driving deeper insights. In this session, we will show three concrete use cases: analysis of financial time series, sales forecasting, and credit risk analysis.
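A minimal wavelet-denoising sketch using PyWavelets; the wavelet family and universal threshold are common illustrative choices, not necessarily the presenters' configuration:

```python
# Soft-threshold the detail coefficients of a wavelet decomposition to
# strip noise before modeling.
import numpy as np
import pywt

def wavelet_denoise(signal, wavelet="db4", level=4):
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745        # robust noise estimate
    thresh = sigma * np.sqrt(2 * np.log(len(signal)))     # universal threshold
    coeffs[1:] = [pywt.threshold(c, thresh, mode="soft") for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)

# Demo on a noisy sine wave.
t = np.linspace(0, 1, 1024)
noisy = np.sin(8 * np.pi * t) + 0.4 * np.random.default_rng(0).normal(size=t.size)
clean = wavelet_denoise(noisy)
```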
An emerging biotech company launched the first treatment option in an unestablished rare disease market. Extremely low prevalence, lack of physician awareness, no codified ICD-10 diagnosis code, and the lack of approved treatments resulted in significant misdiagnosis, making the application of AI challenging. Addressing the challenge required combining first- and third-party de-identified data in a HIPAA-compliant workflow based on Swoop's privacy platform. Of the 84 start forms in the past 6 months, 24 (29%) were due to Swoop's AI model. Further validation is underway in a university hospital system, embedding predictions into clinical workflows to improve patient outcomes.
4:45 pm - 5:05 pm
Optoro's three core data culture problems were the following:
- Fear of Data
- Inconsistent Use of Vocabulary and Metrics
- Data Mistrust
This presentation will outline the strategies that we used to combat these issues. This ongoing endeavor is yielding benefits to Optoro, including:
- Increased alignment on company goals
- Improved ease of communication between teams and with Senior Management
- Consistency across external messaging
The session helps participants understand the role of data, the importance of a data strategy in an organization, the types of business analytics to execute, and ten practical steps to develop a data strategy to improve business processes. We will explore how to conduct research to identify opportunities to minimize threats, manage risks, and improve performance.
The session will use the dataFonomics® (A Data to Information Economics Framework) methodology as a platform for developing a Data Strategy. Attendees will be given an overview of the framework, steps, practical knowledge on data analysis and how to develop a data strategy.
How much data is enough to build an accurate deep learning model? This is one of the first and most difficult questions to answer early in any machine learning project. However, the quality and applicability of your data are more important considerations than quantity alone. This talk presents insights and lessons learned for gauging the suitability of electronic health record (EHR) training data for a life underwriting project. You will see how to determine if more data might increase accuracy and how to identify any weaknesses a deep neural network might have as a result of your current training data.
There have been ten US recessions since 1950. When is the next one? The answer matters because asset prices plunge in recessions, which creates both risk and opportunity. Forecasters answer this question by looking at leading economic indicators. We translate the thinking of forecasters into machine learning solutions. This talk explains the use of recurrent neural networks, which excel at learning historical patterns that don’t repeat, but rhyme. Our model anticipates the Great Recession from past data and exhibits lower error than established benchmarks. The proposed approach is broadly applicable to other prediction problems such as revenue and P&L forecasting.
Wait a minute! Comedy at a machine learning conference? Yes, indeed, PAW Business has added Yoram Bauman, PhD, “the world’s first and only stand-up economist,” to the roster. Predictive analytics and economics?
As Earl Wilson famously said, "An economist is an expert who will know tomorrow why the things he predicted yesterday didn't happen."
When Yoram said he wanted to be a stand-up economist, his father infamously said, "You can't do that – there's no demand." And yet, Yoram has made a splash on TV and the stage, not to mention pursuing a serious economics career at the same time.
Come experience Yoram's stand-up session, "Knock Knock. Who’s There? A.I."
We’re in a global analytics arms race, where yesterday’s strategic advantage can quickly become tomorrow’s industry standard. To stay competitive, companies must continue to invest and evolve at an ever-increasing rate.
In this keynote session, Disney Sr. Vice President of Revenue Management and Analytics, Mark Shafer, will discuss his 30-year rags-to-riches analytical journey, including lessons learned from being on the receiving end of analytics at People Express Airlines to building a science-based analytical team at The Walt Disney Company.
During his 23 years at Disney, Mark led an analytical transformation, starting by implementing Walt Disney World's first resort revenue management model and currently leading an internal consulting team of more than 150 employees responsible for supporting analytics across The Walt Disney Company, including Parks and Resorts, Media Networks (ABC, ESPN, Disney Channel, A&E Networks, etc.), and Studio Entertainment (The Walt Disney Studios, Disney Theatrical).
Leave with deep insights and practical advice on how to steer a successful analytics journey at your company.
Predictive modeling continues to play an important role in the claim process for Property & Casualty (P&C) insurers and Third Party Administrators (TPAs). This session focuses on the TPA environment, utilizing the workers' compensation line of business as a case study into the rationale, implementation, and outcomes that follow the decision to deploy predictive modeling. The speaker will explain the environment of workers' compensation claims handling and the role of the TPA as it relates to assisting employers in managing risk. With this foundation, the session will move into how predictive modeling can be deployed in the claim process, looking specifically at one TPA's efforts to increase efficiency and improve client outcomes. Participants of the session will walk away with the following information:
- Basics on the P&C TPA environment
- Opportunities for predictive modeling in the claim process
- Implementation challenges and pitfalls to consider regardless of industry
- Overcoming pitfalls
- Realizing outcomes in a dynamic environment
Data underlies all of our best efforts to evolve health care practices. Data, and lots of it, now come in many forms and from many sources. Data is the catalyst for the transition from volume-based, episodic care to value-based, personalized care. A workable data strategy has to account for a variety of data forms and sources. A good data strategy bakes in empathy for each individual represented by the data. And, a great data strategy ensures that any movement of data within the organization is reliable, timely, and makes provision for increased data asset value. Great data strategy is the foundation for improving the delivery and outcomes of our healthcare experience. Gerhard Pilcher will share insights, tips, and lessons learned from more than 20 years of work solving problems and providing guidance to many different types of complex organizations within the health care industry and beyond.
Turning data into a business advantage through optimization is the goal of most organizations. UPS has been on a twenty-year journey to achieve this goal and has seen cost improvements reaching $1B annually. At the same time, UPS has been able to offer new products and services backed by data and analytics.
Deep neural networks provide state-of-the-art results in almost all image classification and retrieval tasks. This session will focus on the latest research on active learning and similarity search for deep neural networks and how they are applied in practice by the Verizon Media Group. Using active learning, we can select better images and substantially reduce the number of images required to train a model. It enables us to achieve state-of-the-art performance while substantially reducing cost and labor. By using triplet loss for similarity search, we can improve our ability to retrieve better images for shopping applications and advertising.
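For readers new to the term, here is the triplet-loss objective itself in a few lines (a generic sketch, not Verizon Media Group's implementation): the loss penalizes an anchor embedding that sits closer to a non-matching image than to a matching one, up to a margin.

```python
# Generic triplet loss on L2-normalized embeddings (margin is illustrative).
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Pull the anchor toward the positive, push it away from the negative."""
    d_pos = np.sum((anchor - positive) ** 2, axis=-1)
    d_neg = np.sum((anchor - negative) ** 2, axis=-1)
    return np.maximum(0.0, d_pos - d_neg + margin)

a, p, n = (np.random.randn(4, 128) for _ in range(3))
a, p, n = (v / np.linalg.norm(v, axis=1, keepdims=True) for v in (a, p, n))
print(triplet_loss(a, p, n))   # one loss value per (anchor, pos, neg) triplet
```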
70% of digital transformation initiatives are not reaching their intended goals; that translates to $900 billion lost last year alone. The gap between generated insight and action within the last mile is where many businesses get stuck. diwo (Data In, Wisdom Out) was engineered from the ground up to tackle the decision-making process directly, beginning with the last mile in mind while acknowledging a business's previous trends to visualize its next best move. The new platform's scalable design seamlessly integrates with existing data, making implementation simple. diwo effortlessly meets business owners where they are, optimizing the decisions they need to make today.
How can you predict portfolio delinquency? How do you proactively consider hundreds of factors in assessing the risk in your portfolio? Bill Rachilla, Senior Product Manager for Loven Systems' diwo solution, discusses the answers to these questions through the use of a cognitive decision-making platform.
Machine learning techniques in healthcare often get the bad reputation of being black-box methods. This session will bust that myth and show how interpretability tools can not only give you more confidence in a machine learning model but also help improve the insights you generate from it. This talk will cover best practices for using techniques such as feature importance, partial dependence, and explanation approaches. Along the way, we will consider different issues that may affect model interpretation and performance.
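As a taste of two of the techniques named above, the sketch below computes permutation feature importance and a partial dependence curve with scikit-learn; the synthetic data, model, and chosen feature are illustrative assumptions.

```python
# Permutation importance + partial dependence on a synthetic classifier.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import permutation_importance, partial_dependence

X, y = make_classification(n_samples=2000, n_features=8, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X, y)

imp = permutation_importance(model, X, y, n_repeats=10, random_state=0)
print("importance per feature:", imp.importances_mean.round(3))

pd_result = partial_dependence(model, X, features=[0])  # effect of feature 0
print("partial dependence of feature 0:", pd_result["average"].round(3))
```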
An introduction to the basics of neural networks (NNs) in the Wolfram Language and how NN layers can be connected into either Chains or Graphs to construct deep learning networks. We will cover basic constructs like optimization through Stochastic Gradient Descent, encoders and decoders, NN layers of different types, containers for the layers, the problem of overfitting, and examples from the Wolfram Neural Net Repository. References: http://reference.wolfram.com/language/guide/NeuralNetworks.html and http://resources.wolframcloud.com/NeuralNetRepository/
10:05 am - 10:25 am
Concerns are constantly being raised about what data is appropriate to collect and how (or if) it should be analyzed. There are many ethical, privacy, and legal issues to consider, and in many cases no clear standards exist as to what is fair and what is foul. This means that organizations must consider their own principles and risk tolerance in order to implement the right policies. This talk will cover a range of ethical, privacy, and legal issues that surround analytics today. It will frame big questions to consider while laying out some of the tradeoffs and ambiguities that must be addressed.
Track 1: BUSINESS - Analytics operationalization & management
10:30 am - 10:50 am
From predicting which candidates will make great employees and which employees are likely to leave the organization, to forecasting diversity trends and achieving pay equity, employers are increasingly turning to data science to help streamline their employment processes. Despite great promise, using data science in workplace management can expose employers to a crippling degree of legal risk and potential liability, if the relevant legal and ethical issues are not carefully considered. Join us for this engaging workshop as a data scientist and a lawyer from preeminent workplace law firm Jackson Lewis demonstrate how employers can unlock the full potential of leveraging data science to manage the workplace and avoid the unintended consequences of doing so.
10:05 am - 10:25 am
As the global volume of data increases, the challenge of monetizing data is only growing. In fact, data is projected to increase ten-fold by 2025, and 25% will be real-time in nature, requiring sophisticated systems and processes to capture and utilize effectively. One of the most common business questions overheard at companies is how to leverage the value of “dead data.” Data monetization is “the collection and packaging of data (or data insights) for delivering value-added services or creating revenue-generating products”. As the term “value” suggests, data monetization goes beyond just selling or transferring data assets. Instead, the best data monetization practices include both direct strategies and indirect strategies. An indirect strategy may involve using data to improve customer experience, drive cross-selling, or improve performance, and a direct strategy may involve creating new sources of revenue with outside partners.
As the volume of data explodes, companies are finding creative ways to exploit this information. During this discussion, Lawrence will talk through simple steps to start leveraging the value of your data, with a specific focus on analytics initiatives.
Track 2: DATA - Data Strategies & data prep
10:30 am - 10:50 am
This presentation provides insights on how to optimize marketing campaigns by predicting responses. The original idea was implemented for a large insurance company's marketing campaign; we then modified the idea and iterated on it for internal marketing lead-generation campaigns. In this case, we gained access to customers' attributes from a third-party data provider and to how responders responded to previous marketing campaigns. The attributes included customer age, profession, preferred contact type, month, and past campaigns; the target variable was whether the customer responded to a previous campaign and purchased an item. We developed multiple machine learning models, such as ensembles and gradient boosting, selected the best model with the highest accuracy, and finally created the appropriate label for the response. The process gave us access to a precise marketing list for the campaigns, improving response performance by 25%-30% over previous campaigns.
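A hedged sketch of the kind of response model the description suggests: gradient boosting on customer and campaign attributes, then keeping the highest-scoring prospects. The column names and synthetic data are hypothetical stand-ins, not the project's actual schema.

```python
# Campaign response model sketch on synthetic data (columns are hypothetical).
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5000
df = pd.DataFrame({
    "age": rng.integers(18, 80, n),
    "profession": rng.choice(["teacher", "engineer", "nurse"], n),
    "contact_type": rng.choice(["phone", "email"], n),
    "month": rng.integers(1, 13, n),
})
df["responded"] = (rng.random(n) < 0.1).astype(int)  # 1 = responded + purchased

X = pd.get_dummies(df.drop(columns="responded"))
X_tr, X_te, y_tr, y_te = train_test_split(X, df["responded"], test_size=0.3,
                                          random_state=0)
model = GradientBoostingClassifier().fit(X_tr, y_tr)
print("AUC:", roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))

# Score everyone and keep the top decile as the campaign target list.
df["score"] = model.predict_proba(X)[:, 1]
target_list = df.nlargest(len(df) // 10, "score")
```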
At Overstock.com, lack of data has never been an issue. We know everything from the color you search most to which room you'll redesign next. We can see individuals transition from furnishing their first flat to building their dream home, but processing this data requires some serious firepower. That need has fueled our focus on delivering real-time personalization through the unification of data and AI.
Tune in as Chris Robison and Ramsey Kail take you through martech innovations in building a successful marketing technology infrastructure for instantaneous, individualized marketing experiences.
The use of AI in decision-making processes brings efficiency and data-driven results, but also risks. Machine learning creates models that make predictions based upon patterns learned from past data. The reasoning behind these decisions is not available to the users of the models or to the recipients dealing with the consequences of the decisions: a salesperson doesn't know why a business is a good lead, and an applicant doesn't know why credit is denied. This is a case study on adding explanations to machine learning algorithms so that users will have greater confidence in, and insight into, machine-driven decisions.
The speaker will review case studies from real-world projects that built AI systems that use Natural Language Processing (NLP) in healthcare. These case studies cover projects that deployed automated patient risk prediction, automated diagnosis, clinical guidelines, and revenue cycle optimization. He will also cover why and how NLP was used, what deep learning models and libraries were used, and what was achieved. Key takeaways for attendees will include important considerations for NLP projects including how to build domain-specific healthcare models and using NLP as part of larger machine learning and deep learning pipelines.
In today’s digital age, users expect a fast, reliable mobile experience. Degradations (also referred to as regressions) in mobile app performance affect not only user experience but also business metrics. However, existing mobile app release pipelines lack the necessary infrastructure to detect regressions in a mobile app's performance before it is rolled out to the world. At Uber, we are building a state-of-the-art mobile regression detection pipeline with the goal of detecting regressions as small as 1%. Our approach includes both technological innovation and the use of machine learning along with statistical testing techniques to improve the sensitivity of the regression experiments.
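The pipeline's internals aren't spelled out here, but the statistical-testing core can be illustrated simply: compare latency samples from a control build and a treatment build with a nonparametric two-sample test. The distributions, sample sizes, and the roughly 1% shift below are synthetic assumptions.

```python
# Detecting a small latency regression with a two-sample test (synthetic data).
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(0)
control = rng.lognormal(mean=5.00, sigma=0.3, size=2000)    # startup time, ms
treatment = rng.lognormal(mean=5.01, sigma=0.3, size=2000)  # ~1% slower build

stat, p = mannwhitneyu(control, treatment, alternative="less")
shift = np.median(treatment) / np.median(control) - 1.0
print(f"median shift: {shift:+.2%}, p-value: {p:.4f}")
# Shifts this small need large samples and variance reduction, which is where
# the ML-assisted sensitivity improvements described above come in.
```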
At the forefront of deep learning research is a technique called reinforcement learning, which bridges the gap between academic deep learning problems and ways in which learning occurs in nature in weakly supervised environments. This technique is heavily used when researching areas like learning how to walk, chase prey, navigate complex environments, and even play Go. In this session, Martin Görner will detail how a neural network can be taught to play the video game Pong from just the pixels on the screen. No rules, no strategy coaching, and no PhD required. Martin will build on this application to show how the approach can be generalized to other problems involving non-differentiable steps that cannot be trained using traditional supervised learning techniques.
This is a prelude to Martin’s full day workshop on Thursday, June 20th: Hands-On Deep Learning in the Cloud: Fast and Lean Data Science with Tensorflow, Keras, and TPUs.
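As a pocket-sized illustration of the policy-gradient idea behind the session (a toy, not Martin's Pong demo), the sketch below runs REINFORCE on a contrived one-dimensional task: the reward is not differentiable, yet the policy improves by nudging the log-probability of whichever action was taken, scaled by the reward.

```python
# Toy REINFORCE: a linear policy learns to pick the action matching the
# sign of a random state. Task, sizes, and learning rate are illustrative.
import numpy as np

rng = np.random.default_rng(0)
w = np.zeros(2)          # weights for a 1-D state plus a bias term
lr = 0.1

def policy(s):           # probability of choosing action 1 (sigmoid)
    return 1.0 / (1.0 + np.exp(-(w[0] * s + w[1])))

for episode in range(2000):
    s = rng.choice([-1.0, 1.0])                 # observe a state
    p = policy(s)
    a = 1 if rng.random() < p else 0            # sample an action
    r = 1.0 if (a == 1) == (s > 0) else -1.0    # reward: not differentiable
    grad_logp = (a - p) * np.array([s, 1.0])    # d log pi(a|s) / d w
    w += lr * r * grad_logp                     # REINFORCE update

print(policy(1.0), policy(-1.0))  # approaches ~1.0 and ~0.0 after training
```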
11:20 am - 11:40 am
The age of extending consulting services to help firms find analytical insights in their data is coming to a close. As businesses and institutions become more savvy in mining their own data, the traditional insights generated by consulting services (both internal and external) are giving way to a new paradigm: data science products. This talk explores this shift in the industry and what it means for analytics and data science professionals, including how rapid advances in machine learning and artificial intelligence technologies are necessitating changes in how we think about project management, professional services, and analytics delivery models.
Track 1: BUSINESS - Analytics operationalization & management
11:45 am - 12:05 pm
Innovative and impactful data science work happens when there is a mix of talented data science professionals, challenging business problems and (most importantly) data. In order to build data science solutions at scale however, the data fueling the analytical work must be clean and easily accessible to the advanced algorithms that will be leveraging it. This presentation will cover how the critical tasks of data acquisition, cleaning, storage and pipeline development must be considered when designing and operationalizing large scale data science solutions.
One of the biggest challenges in corporations is training new data scientists to build the most predictive models possible with a given data set and modeling algorithm. Following the approach he has developed teaching this critical topic over more than 20 years of industry practice, Bob Nisbet will demonstrate the effect on preliminary models of a progressive series of common data preparation steps -- each applied to the same data set (the KDD-Cup 1998 data set) -- including the following (a brief code sketch of the first steps follows the list):
- Filling of missing values
- Derivation of "dummy variables"
- Feature selection
- Deriving custom variables, based on business insights, which become powerful predictors
- Showing how to incorporate time-series data as predictors of system response with a given prediction horizon
- Showing how different data conditioning operations (e.g. balancing and standardization) can generate very different predictive outcomes
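A minimal sketch of the first three steps above, on a synthetic pandas DataFrame; the column names and the "responded" target are hypothetical stand-ins, not the KDD-Cup 1998 schema.

```python
# Steps 1-3 of the list above: fill missing values, derive dummy variables,
# and run univariate feature selection (synthetic stand-in data).
import numpy as np
import pandas as pd
from sklearn.feature_selection import SelectKBest, f_classif

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(1000, 30)),
                  columns=[f"x{i}" for i in range(30)])
df["state"] = rng.choice(["CA", "TX", None], 1000)  # categorical with gaps
df.iloc[::7, 0] = np.nan                            # numeric gaps
df["responded"] = rng.integers(0, 2, 1000)          # hypothetical target

# 1. Fill missing values: median for numeric columns, mode for categorical.
for col in df.select_dtypes("number"):
    df[col] = df[col].fillna(df[col].median())
for col in df.select_dtypes("object"):
    df[col] = df[col].fillna(df[col].mode()[0])

# 2. Derive dummy variables from categorical fields.
X = pd.get_dummies(df.drop(columns="responded"))
y = df["responded"]

# 3. Keep the 20 strongest univariate predictors.
X_selected = SelectKBest(f_classif, k=20).fit_transform(X, y)
print(X_selected.shape)
```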
11:20 am - 11:40 am
In the event industry, use of machine learning is not commonplace. This talk covers how UBM/Informa uses automated machine learning (AML) technology to improve its sales and marketing processes, including application areas such as identifying the most suitable marketing plan to maximize ROI and forecasting the number of event pre-registrants. We employed an AML platform to build and deploy accurate machine learning models quickly. Informa is a leading business intelligence, academic publishing, knowledge, and events business.
Track 3: CASE STUDIES - Cross-industry business applications of machine learning
11:45 am - 12:05 pm
Every year, corporations spend more than $250B on litigation in the US. The critical decisions - whether to litigate or settle, or where to file suit - are often made the same way they were 100 years ago. To gain insight that companies could use to make informed decisions on legal proceedings, we built a predictive analytics engine. The approach, combining minimal viable prediction with data from thousands of patent appeal cases over 10 years, was developed to predict outcomes in future patent appeal cases. We think of it like Moneyball, but for a market 20x the size of the Majors.
Predictive models are increasingly used for important decisions such as which customers may open a financial account. These decisions affect the opportunities available to customers and drive business results. To maintain the customers' trust, it is important to be able to explain individual predictions. Likewise, it is important to explain the model logic for business managers, compliance professionals and regulators who expect fair decisions. This is not easy when using advanced techniques such as ensemble models. Mr. Duke will share Experian's recent advances in explainable AI technologies, with results in credit risk modeling, synthetic identity detection and fraud prevention.
Healthcare has always used statistical analysis and analytic capabilities for accounting, reimbursement, actuarial and fiscal projection purposes. New developments in advanced statistical and predictive analytics techniques promise to revolutionize health and medical outcomes, and care delivery. These new techniques utilize modern machine learning and Artificial Intelligence methods to predict and prescribe at the individual level, instead of using traditional statistics. Learn how new machine learning techniques are being used for value-based purchasing, population health, healthcare consumerism and precision medicine. Peer into the future of Healthcare Data Science with predictions from industry leaders.
At GM, we are committed to a world with zero crashes, zero emissions, and zero congestion. Preventing crashes is directly tied to human intelligence, that is, learning from experience, adapting to new situations, and using knowledge to avoid accidents.
Advanced Driver-Assistance Systems (ADAS) are a big step taken by car manufacturers to prevent accidents. The effects of ADAS on human attention and perception differ depending on road conditions as well as on the setting (e.g., rural vs. urban roads).
In a pair of case studies, we examine ADAS systems and insurance underwriting risk.
The application of advanced analytics techniques like machine learning and deep learning is in its initial stages in the airline industry. In this presentation, we outline how this technology is being applied, drawing on GE Aviation's experience advancing it over the last 7 years.
Are you curious about how companies address the gaps in their employees’ data and analytics skills? As the Corporate Training lead at Metis, I’ve worked with a wide range of organizations – from blue chip financial services to boutique tech startups – to develop training programs that help build the competencies needed to successfully apply data science that leads to growth, innovation, and better decision making. In this talk, I’ll share some of these real-world scenarios and answer your questions about the ways training can help your company achieve its goals. Participants will learn:
- Which data science and analytics skills are most in demand
- How certain skills are evolving to meet market demand
- How companies use training to solve common problems and achieve strategic goals
The core Bayesian idea, when learning from data, is to inject information — however slight — from outside the data. In real-world applications, meta-information is clearly needed — such as domain knowledge about the problem being addressed, what to optimize, what variables mean, their valid ranges, etc. But even when estimating basic features (such as rates of rare events), even vague prior information can be very valuable. This key idea has been re-discovered in many fields, from the James-Stein estimator in mathematics and Ridge or Lasso Regression in machine learning, to Shrinkage in bio-statistics and “Optimal Brain Surgery” in neural networks. It’s so effective — as I’ll illustrate for a simple technique useful for wide data, such as in text mining — that the Bayesian tribe has grown from being the oppressed minority to where we just may all be Bayesians now.
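A worked miniature of that shrinkage idea for rare-event rates (the prior and counts below are illustrative, not from the talk): blend each segment's raw rate toward a vague prior, weighted by how much data the segment has.

```python
# Shrinkage of rare-event rates toward a vague prior (illustrative numbers).
import numpy as np

prior_rate, prior_strength = 0.01, 100   # ~1% prior, worth ~100 pseudo-trials
events = np.array([0, 2, 50])            # observed rare events per segment
trials = np.array([10, 40, 5000])        # observations per segment

raw = events / trials
shrunk = (events + prior_rate * prior_strength) / (trials + prior_strength)
print("raw rates:   ", raw.round(4))     # small segments look like 0% or 5%
print("shrunk rates:", shrunk.round(4))  # small segments pulled toward 1%
```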
In the insurance and banking industries, the track record of contributions made by women continues to grow. This is helping pave the way for future female scientists and analytics leaders. Predictive analytics and machine learning are no exception. At this panel session, learn from women in these fields what they've learned along the way, their wins and losses, and how they are helping others do the same. Our expert panelists will address questions such as:
- How can you best fit in and stand up as a woman in predictive analytics and machine learning?
- What are the key elements of being a successful woman scientist in these fields?
- What are the key elements of being a successful woman analytics leader?
- How can you best build and manage your analytics team as a female analytics leader?
- How can you increase the number of women on your analytics team, especially in leadership roles?
- What are the differences from other science and engineering fields in terms of male domination?
- How do you suggest balancing work and personal life?
Multiple studies and surveys reveal that the health care industry lags behind other major industries when it comes to the adoption of analytics. Questions for debate include whether or not this is a fair assessment and, if so, why this is the case. Join our panel of experts as they explore the state of analytics in health care and discuss the obstacles and the opportunities for advancement with this important technology.
In this talk, Chandra Khatri, Senior AI Scientist at Uber AI and formerly at Alexa AI, will detail various problems associated with Conversational AI, such as speech recognition, language understanding, dialog management, language generation, sensitive-content detection, and evaluation, and the advancements brought by deep learning in addressing each of these problems. He will also present the applied research work he has done at Alexa and Uber on the problems mentioned above.
Data Science and machine learning practices have been well established in the financial industry for decades, vastly expanding the depth, breadth, and speed of analysis within securities trading and risk management, to name a few. Open source libraries and commercial automated machine learning platforms are evolving fast and being a “quant” is no longer a requirement to access and implement these advanced techniques. In this talk, we will review the evolution of tools in this space and highlight several use cases with broad application across the financial industry.
The companies getting the most value from advanced analytics spend much more of their time and money embedding analytics into their core workflows than others. The most successful, in fact, spend more than half their analytics budget not to build analytics, but to deploy and operationalize it. Companies that don’t complete this last mile, those that stop once they have completed the core analytics, see their analytic investments go to waste. Join this expert panel to hear what you can do to make sure you can embed analytics in your front line and maximize the return on your analytics investment.
2:15 pm - 2:35 pm
"Build a better mousetrap and the world will beat a path to your door." Build a dozen mousetraps, with different triggers, bait, and alarms all meant for different species of mice, and the world will beat your door down. This presentation addresses the challenges of alert fatigue generated by successful predictive models. When your target audience is presented with multiple models recommending different and occasionally overlapping but important actions, there needs to be harmony in the message sent. This presentation examines the interactions of prioritization, frequency, severity development, consistency, messaging, and user socialization across multiple models for effective action.
2:40 pm - 3:00 pm
There have been ten US recessions since 1950. When is the next one? The answer matters because asset prices plunge in recessions, which creates both risk and opportunity. Forecasters answer this question by looking at leading economic indicators. We translate the thinking of forecasters into machine learning solutions. This talk explains the use of recurrent neural networks, which excel at learning historical patterns that don’t repeat, but rhyme. Our model anticipates the Great Recession from past data and exhibits lower error than established benchmarks. The proposed approach is broadly applicable to other prediction problems such as revenue and P&L forecasting.
How much data is enough to build an accurate model? This is often one of the first and most difficult questions to answer early in any machine learning project. However, the quality and applicability of your data are more important considerations than quantity alone. This talk presents some insights and lessons learned for gauging the suitability of electronic health record (EHR) training data for a desired project. You will see how to determine if more data might increase accuracy and how to identify any weaknesses a model might have as a result of your current training data.
In applications like fraud and abuse protection, it is imperative to use progressive learning and fast retraining to combat emerging fraud vectors. However, somewhat unfortunately, these scenarios also suffer from the problem of late-coming supervision (such as late chargebacks), which makes the problem even more challenging. If we use a direct supervised approach, much of the valuable sparse supervision signal gets wasted on figuring out the manifold structure of the data before the model actually starts discriminating newly emerging fraud. At Microsoft, we are investigating unsupervised learning, especially auto-encoding with deep networks, as a preprocessor that can help tackle this problem. An auto-encoding network, which is trained to reconstruct (in some sense) the input features through a constriction, learns to encode the manifold structure of the data into a small set of latent variables, similar to how PCA encodes the dominant linear eigenspaces. The key point is that the training of this auto-encoder happens with the abundant unlabeled data; it does not need any supervision. Once trained, we then use the auto-encoder as a featurizer that feeds into the supervised model proper. Because the manifold structure is already encoded in the auto-encoded bits, the supervised model can immediately start learning to discriminate between good and bad manifolds using the precious training signal that flows in about newly emerging fraud patterns. This effectively improves the temporal tracking capability of the fraud protection system and significantly reduces fraud losses. We will share some promising early results we have achieved using this approach.
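The description maps onto a short sketch: train an autoencoder on abundant unlabeled transactions, then use its bottleneck as a frozen featurizer for the supervised fraud model. Layer sizes and the random stand-in data are assumptions, not Microsoft's production configuration.

```python
# Autoencoder-as-featurizer sketch: unsupervised pretraining on unlabeled data,
# then a small supervised head on the scarce labeled data.
import numpy as np
import tensorflow as tf

X_unlabeled = np.random.randn(10000, 64).astype("float32")  # abundant, no labels
X_labeled = np.random.randn(500, 64).astype("float32")      # scarce, labeled
y_labeled = np.random.randint(0, 2, size=(500,))            # 1 = fraud

inp = tf.keras.Input(shape=(64,))
code = tf.keras.layers.Dense(8, activation="relu")(inp)     # the constriction
out = tf.keras.layers.Dense(64)(code)
autoencoder = tf.keras.Model(inp, out)
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(X_unlabeled, X_unlabeled, epochs=5, verbose=0)  # no supervision

encoder = tf.keras.Model(inp, code)     # featurizer: manifold -> 8 latent bits
clf = tf.keras.Sequential([tf.keras.layers.Dense(1, activation="sigmoid")])
clf.compile(optimizer="adam", loss="binary_crossentropy")
clf.fit(encoder.predict(X_labeled), y_labeled, epochs=10, verbose=0)
```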
3:30 pm - 3:50 pm
Every company lives and dies by the decisions it makes, but none more so than high-growth start-ups. While start-ups are in high-growth mode, they have to make quick, meaningful decisions that have impact today, and to ensure success those decisions have to be insightful and data-informed. This talk discusses the process of enabling everyone in a company with the ability and access to analytics: how you get non-technical users engaging with data while getting your technical data folks well versed in understanding the business beyond the numbers. With this data-informed ecosystem, any company can make efficient, informed decisions to drive the business.
Track 1: BUSINESS - Analytics operationalization & management
3:55 pm - 4:15 pm
Analytics has become extremely valuable, as it enables businesses to analyze their data and make data-driven decisions by uncovering insights and predicting outcomes. In this talk, I will share my personal story of how to hire, build, and maintain world-class analytics teams.
3:30 pm - 3:50 pm
Preeminent consultant, author and instructor Dean Abbott, along with Rexer Analytics president Karl Rexer, field questions from an audience of predictive analytics practitioners about their work, best practices, and other tips and pointers.
Track 2: DEPLOYMENT - Predictive model deployment & integration
3:55 pm - 4:15 pm
Today's organizations have billions of dollars riding on the accuracy and performance integrity of analytical models. With model performance becoming a strategic enabler and a potential source of liability, organizations need to manage the risks associated with analytics.
To manage these risks effectively and move beyond simple financial model or spreadsheet auditing, organizations need a system of controls around analytic model development. These analytics controls provide checks and balances around model selection, validation, implementation, and maintenance.
British Prime Minister Benjamin Disraeli once said, "There are three kinds of lies: lies, damned lies, and statistics." Hollywood is gradually coming around to data-driven decision-making, but some skepticism towards quantitative analysis still lingers. This presentation will provide an overview of movie-related metadata and how data silos are starting to break down at studios. Additionally, AI/machine learning examples from the entertainment industry will be shared to show how an improving analytics culture is providing actionable insights to: (1) mitigate risk when green-lighting movies, (2) improve box office predictions by building better statistical models, and (3) drive profits with targeted marketing campaigns.
3:30 pm - 3:50 pm
While many businesses understand the value of advanced analytics in decision making, operationalizing data science can be challenging. This session reveals how Enova International has been successful at integrating traditional operations with advanced analytics to turn fraud defense into a collaborative analytics function. Over time, through combining the latest technologies in data, machine learning and decision automation with manual investigations, Enova International has been able to attract and retain top analytics talent, mitigate fraud risk, improve profitability and deliver a better customer experience.
3:55 pm - 4:15 pm
Many businesses determine customer lifetime value (CLTV) in order to plan how to attract and retain customers. Traditionally, they use descriptive analytics to determine the average CLTV. However, now that customers expect personalized service, these methods are inadequate. By predicting how long a new customer is expected to stay, and consequently their expected CLTV, companies can make decisions on the best way to serve them. In this talk, we will discuss practical tips and lessons learned in building machine learning models for determining CLTV, including pitfalls to avoid and how deployment affects model selection.
With the advent of big data and machine learning, there is an opportunity to combat rising healthcare costs by leveraging data in an ethical and privacy-compliant way to establish more consistent implementation of preventative care. We need to ensure there is a fundamental set of rules and responsibilities in place among healthcare organizations to protect their patients' privacy. In this presentation, we will address this challenge and speak to the importance of creating an ethical and privacy-compliant approach to aggregating multiple data sources, which can then be used to improve patient outcomes.
Advanced analytics transforms data into actionable insights, augmenting human decision making in ways previously impossible. At OSF Healthcare, properly formulated and deployed advanced analytics solutions allow us to save and improve more lives, decrease the cognitive load on our mission partners, improve our financial performance, and generate transformative innovations.
This session will share some of the best examples of beneficial advanced analytics solutions over OSF Healthcare’s 5+ year journey in the space. Highlights will include clinical improvements achieved through the integration of risk modeling and decision support tools, operational efficiencies empowered by risk assessment automation and clinical information extraction through natural language processing.
Vistra Energy is one of the largest energy companies in the United States, owning both power generation and retail operations in extremely competitive markets throughout the country. We combine the big data opportunities we have available as a utility with advanced analytics to provide premium customer services that differentiate us from other utility providers. We will present three case studies demonstrating how we are able to leverage our firehose of 15-minute-interval IoT device energy usage data from over 1.5 million customers with advanced modeling techniques to provide added value to our customers and increase brand loyalty.
Logs are a valuable source of data, but extracting knowledge from them is not easy. Getting actionable information frequently requires creating dedicated parsing rules, which leaves out the long tail of less popular formats. Widely applying real-time pattern discovery establishes each log as its own event of a given type (pattern) with specific properties (parameters). This makes logs a tremendous input source for deep learning algorithms that filter out noise and present what's most interesting. This talk reviews real-life cases where these techniques allowed us to pinpoint important issues, and highlights insights on how best to elevate DL in the development lifecycle.
3:30 pm - 3:50 pm
Automated modeling is already a focus for practitioners. However, applications for marketing campaigns require significant effort in data preparation. To address this bottleneck, the robotic modeler integrates a front layer, which automatically scrolls through executed campaigns and prepares data for modeling, with a machine learning engine. It enables automated campaign backend modeling, generates scoring code, and produces supporting documentation. The robotic modeler supports generalized deep learning, assembling business targets and features. Systematically running the robotic modeler provides additional benefits, including assessing input feature importance across various campaigns and estimating cross-campaign effects. It empowers "hyper-learning" derived from campaign modeling.
4:20 pm - 4:40 pm
In the future, there will not be a shortage of doctors, lawyers, teachers, and accountants; there will be a shortage of people in those fields who can speak to technology. From cloud computing to mobile and social media, there is an explosion of data from technology, and there is value trapped in siloed organizations where only a handful of specialized people are empowered with the necessary skills to realize the full potential of data. The solution is to use customized and compelling case studies to foster a practical understanding of data analytics. This talk will provide practical steps on how to build data science skills across different functions and disciplines in your organization.
Track 1: BUSINESS - Analytics operationalization & management
4:45 pm - 5:05 pm
Success in a data-driven world means empowering teams with science to improve decision making through confident, replicable and trainable programs that can engage an entire organization. Analytics teams that use a scientific approach to answer business questions will accelerate actionable insights and improve user experiences.
Peter and Martin will discuss their experience driving value in organizations including Lyft, Citrix, Alibaba, and Bell, where data science methods for growth and insights are at the forefront of the business. Data science is a team sport; the people in the business closest to the data are often in a position to know it best. Fostering an analytic mindset throughout the organization and training teams in a scientific approach to attack the problems they encounter will produce a needed competitive advantage.
Gain speed and agility in modeling solutions to the questions in your organization for a deeper understanding of the business landscape.
4:20 pm - 4:40 pm
There is a lot of information and best-practice guidance available to help data scientists build analytic models, but much less about how analytic models can best be integrated into a company's products, services, or operations, which we call analytic operations. We describe three frameworks a company or organization can use to improve its analytic operations and explain the frameworks using case studies.
Track 2: DEPLOYMENT - Predictive model deployment & integration
4:45 pm - 5:05 pm
Many organizations utilize predictive models to make decisions, but what happens when those models fail to deliver, or worse, are totally off? Having audited numerous models across diverse industries as an advanced analytics management consultant, Stephen Chen shares personal WTF experiences and distills the perils inherent in predictive modeling that are typically glossed over in data science courses and texts.
Using real world datasets to illustrate these issues, this session aims to help stakeholders better assess the suitability of models for decision-making, as well as helping practitioners think through their datasets and processes to build more robust models.
The restaurant industry in America is closing on $800b in annual revenue. We have more than a million locations, and we employ more than 14.7m employees. But for all of that, 9 out of 10 of our managers started at the entry level. Ex-dishwashers, busboys, and hosts, now helping us to run an $800b-a-year business.
Predictive analytics has been a buzzword for a few years now, and it has seen success in a wide range of applications. Within the insurance industry, several applications have emerged across underwriting, claims, marketing, and beyond. In this session, we will highlight examples of success factors that help sustain predictive analytics in a workers' compensation insurance environment.
Independent pediatricians typically maintain daily volumes of 20-30 patients to keep their practices viable. Pediatricians also schedule appointments up to a year in advance, leading to as many as 15% of patients not showing up for appointments each day. The financial and clinical impact of these gaps in pediatric appointment books is substantial.
PCC and Rexer Analytics analyzed pediatric no-show patterns to identify the variables that truly affect appointment truancy. These insights were translated into interventions to reduce patient truancy. We present pediatric no-show patterns, key predictors, and the results several Pediatric practices are seeing with targeted interventions.
Acronyms abound in the area of predictive analytics, and machine learning is no exception. The discipline of predictive analytics has been used by businesses since the end of World War II. But machine learning has been at the core of this activity since its very early business applications: in those early days, it was the "machine" itself, and not the human, that identified the predictive algorithms to optimize a given business solution, albeit in a more simplistic manner. With the advent of Big Data, this concept of machine learning has now expanded to more complex forms.
Deep learning models have shown great success in commercial applications such as self-driving cars, facial recognition, and speech understanding. However, these models typically require a large amount of labeled data, presenting significant hurdles for AI startups faced with a lack of data, funding, and resources. In this session, I will discuss how to overcome the cold-start problem of deep learning by using transfer learning, synthetic data generation, data augmentation, and active learning. This talk will go through a real use case of invoice processing and information extraction, which is a critical step in the Accounts Payable process.
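Of the remedies listed, transfer learning is the easiest to show in a few lines: freeze an ImageNet-pretrained base and train only a small new head on the scarce labeled documents. The five-class head and the random stand-in images below are assumptions for illustration.

```python
# Transfer-learning sketch for a cold start: frozen pretrained base, new head.
import tensorflow as tf

base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False                   # reuse ImageNet features as-is

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(5, activation="softmax"),  # e.g., 5 document classes
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Random tensors stand in for a small labeled invoice set.
X = tf.random.uniform((32, 224, 224, 3))
y = tf.random.uniform((32,), maxval=5, dtype=tf.int32)
model.fit(X, y, epochs=2, batch_size=8)
```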
4:20 pm - 4:40 pm
In this talk, I will give a detailed example of how a seamlessly integrated, distributed Spark + deep learning system can reduce training cost by 90% and increase prediction throughput by 10X. With such a powerful tool in hand, a single data scientist can process more data and extract more insight than a team of 20 data scientists using traditional tools.
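The speaker's system isn't described in detail here, but the throughput side of such a claim typically rests on a standard pattern: distributing model scoring across a Spark cluster, for example with a vectorized pandas UDF. The columns and the toy linear "model" below are assumptions for illustration.

```python
# Distributed scoring sketch: a vectorized pandas UDF applies a (toy) model
# to each partition's rows in parallel across the cluster.
import pandas as pd
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, pandas_udf

spark = SparkSession.builder.appName("batch-scoring").getOrCreate()
df = spark.createDataFrame([(0.5, 1.2), (0.1, 0.4), (0.9, 0.7)], ["x1", "x2"])

@pandas_udf("double")
def score(x1: pd.Series, x2: pd.Series) -> pd.Series:
    # In a real job, each executor would load the trained model once and
    # score its batch locally; this linear scorer is a stand-in.
    return 0.3 * x1 + 0.7 * x2

df.withColumn("score", score(col("x1"), col("x2"))).show()
```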