Sunday, October 16, 2011
Half-day Workshop
Room: Madison
R Bootcamp: For Newcomers to R
- Workshop starts at 1:00pm
- Afternoon Coffee Break at 2:30pm - 3:00pm
- End of the Workshop: 5:00pm
Instructor: Max Kuhn, Director, Nonclinical Statistics, Pfizer
Monday, October 17, 2011
Full-day Workshop
Room: Clinton
Predictive Analytics: Fundamentals and Use Cases
- Workshop starts at 9:00am
- Morning Coffee Break at 10:30am - 11:00am
- Lunch provided at 12:30 - 1:15pm
- Afternoon Coffee Break at 2:30pm - 3:00pm
- End of the Workshop: 4:30pm
Instructor: Piyanka Jain, CEO, Aryng.com
Instructor: Puneet Sharma, Senior Manager, PayPal
Full-day Workshop
Room: Madison
R for Predictive Modeling: A Hands-On Introduction
- Workshop starts at 9:00am
- Morning Coffee Break at 10:30am - 11:00am
- Lunch provided at 12:30 - 1:15pm
- Afternoon Coffee Break at 2:30pm - 3:00pm
- End of the Workshop: 4:30pm
Instructor: Max Kuhn, Director, Nonclinical Statistics, Pfizer
Tuesday, October 18, 2011
Full-day Workshop
Room: Madison
Driving Enterprise Decisions with Business Analytics
- Workshop starts at 9:00am
- Morning Coffee Break at 10:30am - 11:00am
- Lunch provided at 12:30 - 1:15pm
- Afternoon Coffee Break at 2:30pm - 3:00pm
- End of the Workshop: 4:30pm
Instructor: James Taylor, CEO, Decision Management Solutions
Full-day Workshop
Room: Clinton
Hands-On Business Analytics: Insights to Impact
- Workshop starts at 9:00am
- Morning Coffee Break at 10:30am - 11:00am
- Lunch provided at 12:30 - 1:15pm
- Afternoon Coffee Break at 2:30pm - 3:00pm
- End of the Workshop: 4:30pm
Instructor: Piyanka Jain, CEO, Aryng.com
Instructor: Puneet Sharma, Senior Manager, PayPal
Full-day Workshop
Room: Trianon
Hands-On Predictive Analytics with SAS Enterprise Miner
- Workshop starts at 9:00am
- Morning Coffee Break at 10:30am - 11:00am
- Lunch provided at 12:30 - 1:15pm
- Afternoon Coffee Break at 2:30pm - 3:00pm
- End of the Workshop: 4:30pm
Instructor: Dean Abbott, President, Abbott Analytics
Wednesday, October 19, 2011
Exhibit Hall Open
Registration & Breakfast
9:00am-9:45am
Room: Gramercy
Keynote
Persuasion by the Numbers: Optimize Marketing Influence by Predicting It
Data-driven marketing decisions are meant to maximize impact - right? Well, the only way to optimize marketing influence is to predict it. The analytical method to do this is called uplift modeling. This is a completely different animal from what most models predict: customer behavior. Instead, uplift models predict the influence on customer behavior gained by choosing one marketing action over another. The good news is case studies show ROI going where it has never gone before. The bad news? You need a control set... But you should have been using one anyway! The crazy part is that "marketing influence" can never be observed for any one customer, since it literally involves the inner workings of the customer's central nervous system. If influence can't be observed, how can we possibly model and predict it?
Speaker: Eric Siegel, Ph.D., Program Chair, Predictive Analytics World
9:45am-10:05am
Diamond Sponsor Presentation
Speeding Up Exploration
Discover more, and find out in time for decisions to matter more.
Well, Wall Street cares so much about making fast decisions that they are laying dedicated high-speed lines so that the data for program trades can be processed faster and orders executed more quickly. If Wall Street can do all this in milliseconds, won't it make a difference to you to get your indicators sooner, in seconds instead of hours, in time for a fast response to make a difference?
And fast visualization is a key here, too. Exploration relies on following clues, and the more things you look at, the faster you can follow clues, the more value you get from seeing your data. Even with big data, you can speed up exploration. More eyes, more scenarios. Instant gratification. No more waiting.
Speaker: John Sall, Co-Founder, SAS & Chief Architect, JMP
Break / Exhibits
10:45am-11:05am
Room: Gramercy B
Track 1: Insurance Applications
Case Study: Infinity Insurance
Next Generation Claims Systems
Today's insurance claims systems deliver value by increasing efficiency through process automation and workflow. The use of manual decision points within these processes and systems, however, creates bottlenecks and under-utilizes company expertise. Insight gained from data mining and analytics is restricted to the back office and IT resources cannot be freed up from legacy systems maintenance. This session will show how a number of insurers are using predictive analytics embedded in rules-based decision services to build a new generation of more effective claims systems.
Speaker: James Taylor, CEO, Decision Management Solutions
11:10am-11:30am
Gold Sponsored Presentations
Room: Gramercy B
Big Data, Deep Analytics
Predictive analytics demands an extremely robust BI infrastructure to handle massive amounts of data and varying numbers of business users, along with query-intensive predictive-scoring workloads. Traditional EDWs consume large amounts of CPU resources to read every byte of every row of large database tables and deliver the query results. They also require complex indexing and summary tables to support query-intensive workloads, and more hardware and DBA resources are required to tune the queries. To solve these problems and enable predictive analytics, an analytics server is needed – one that is architected and optimized for massive data and complex models.
Sybase IQ is a market-leading analytics server enabling organizations to perform deep analysis of massive amounts of data, accessed by hundreds of users requiring answers in real time. It is positioned in the Leaders quadrant of the 2011 Gartner Data Warehouse Database Management System Magic Quadrant Report, and is the #1 column-store today with over 2,000 customers worldwide.
Speaker: David Wiseman, Director of Business Development, Sybase, An SAP Company
An Analytics Environment for the New Reality
There is a new reality in advanced analytics. We are simultaneously moving towards more real-time applications, with the opportunity to utilize the emerging big data, not resident in traditional data warehouses. This requires a new analytics environment, capable of dealing with both ends of the analytics spectrum.
Speaker: David Hastings, Director, Teradata Corporation
10:45am-11:30am
Room: Gramercy A
Track 2: Survey Analysis
Case Study: YMCA
Turning Member Satisfaction Surveys into an Actionable Narrative
Survey analysis often involves hand-tuned analysis requiring weeks of labor to decipher the key relationships in survey responses. Proper coding of responses, collinearity, and missing data plague analysts in their pursuit of clear explanations of responder intent in the surveys. Additionally, while traditional statistical analyses, such as factor analysis, linear and logistic regression, can be used effectively in modeling survey responses, these models do not resonate with the business community in the same way they do with statisticians.
The approach followed in this case study provides a narrative history of how classic statistical and data mining techniques were employed in the analysis of a large survey dataset and how the analytic strategy evolved over time. To address end-user needs and display results in a manner that is intuitive to decision-makers, the structure of the "Member Experience" was re-conceptualized into a six-dimension hierarchy.
A validation of the re-conceptualized theoretical structure, along with live demos of actual data, will be shown to illustrate summaries of the surveys, reveal strengths and weaknesses of branches, and suggest how branches can improve the member experience.
Speaker: Dean Abbott, President, Abbott Analytics
Speaker: Bill Lazarus, President & CEO, Seer Analytics, LLC
Room: Murray Hill A
Track 3: e-Commerce; Thought Leadership
Case Study: PayPal/eBay
Putting Predictive Analytics into Context: The Analytics Value Chain
In a product/services company, analytics generates its greatest value when a certain line-up of best practices is performed, ranging from gross intelligence to a more detailed understanding. This is achieved with a "three pillar" analytical approach: [Measurement Framework, Portfolio Analysis, and Customer Analysis]. Within each of these components, we move from a simpler "20,000 foot" view analysis, to deeper, more comprehensive analytics.
This case study will cover these components in detail, along with the tools and techniques required and the gotchas to look out for. Auxiliary intelligence such as VOC (Voice of the Customer) and Competitor / Industry / Economic landscape analysis, which delivers an [outside-in] view of the business, will also be covered.
What you will walk away with is:
- An understanding of the [analytics value chain], which sets predictive analytics into an impactful context
- Analytics your organization needs to better understand your business
- Tools and methodologies best suited for the [three pillars] of analysis
- Challenges to prepare for, as you embark on these analyses
- Organizational support needed for analytics execution.
Speaker: Piyanka Jain, CEO, Aryng.com
Speaker: Puneet Sharma, Senior Manager, PayPal
11:35am-12:20pm
Room: Grand Ballroom
Special Plenary Session
The Future of Targeting and On-Line Marketing – Predictive Analytics on Big Data
This session will show:
- The growth in search marketing eclipsed traditional brand advertising, for a while. However, there is a comeback that depends heavily on predictive analytics and targeting.
- The future is evolving in the kinds of marketing, how marketers think of targeting, and the importance of context and consumer intent.
- Inferring user intent from behavior, context, and apps that elicit explicit statement of intent is the way to the future of relevant advertising.
- The brave new world of on-line marketing and how it is beginning to break the mold of traditional advertising and marketing.
- For the first time, we are beginning to see new generation marketing approaches that truly leverage the interactive medium that represents on-line and mobile apps.
- What are the new trends? What are these new generation marketing approaches? And what is the role of predictive analytics in this new world?
Lightning Round of 2-minute Vendor Presentations
Birds of a Feather Lunch / Exhibits
1:35pm-2:20pm
Room: Gramercy B
Track 1: Thought Leadership
Case Study: LinkedIn
Data Science at LinkedIn: Iterative, Big Data Analytics and You
Companies that compete on analytics and deliver data-driven services need to iterate quickly on big data. This enables rapid data exploration to identify unknown relationships and trends to create new products and services. Come to this session to see how LinkedIn has created a core competency around analytics. Learn about the techniques and technologies LinkedIn data scientists use to create data-driven products. Get ideas of how to apply iterative big data analytics in your own organization and enable your own analytics center of innovation.
Speaker: Manu Sharma, Principal Research Scientist, LinkedIn
1:35pm-1:55pm
Room: Gramercy A
Track 2: Demand Forecasting
Case Study: Cox Communications
What Happens Next? Automated Smart Demand Forecasting
At Cox Communications, forecasting future demand by product line has become an essential business function to direct and regulate operational, marketing, and sales resources. This case study will focus on a forecasting process transformation that recently occurred in Cox's Central Region and the resulting learnings. In the past, forecasting results were generated by ARIMA and exponential smoothing techniques, modified by analysts, and delivered via an Excel report on a public drive. This report, known locally as the 'Daily Prophet', was recently transformed to a fully automated system producing predictions through neural networks processing lagged data, captured economic variables, and operational metrics. Additionally, errors are now captured and monitored using a re-purposed statistical process that provides notifications of inaccuracy or inconsistency. The transition to automation using a broader set of inputs, and the ability to monitor results, has not only brought increased efficiency to the business, but maintained trust in the transformation as well.
Speaker: Bob Wood, Director of Marketing Science, Cox Communications
Room: Murray Hill A
Track 3: Risk Management
Case Study: ACE Cash Express
Credit Risk Analytics Framework for Subprime Loans
Analytics is all about strategy and framework; rarely is it about tools or techniques. In this case study, I will be diving into the structural requirements of subprime credit risk analytics. We will be discussing a framework that cuts across the spectrum of corporate needs from loan origination to loss mitigation.
Top 3 takeaways:
- Unified framework to analyze subprime consumer loans,
- Evaluating when a strategic change is necessary,
- Methodically instituting the fine balance between limiting the risk and growing the top-line.
Subprime lending presents unique twists to these problems. In this session, I will be focusing on the best way to frame these problems.
Speaker: Senthil Ramanath, Head of Analytics, ACE Cash Express
2:00pm-2:20pm
Room: Gramercy A
Track 2: Forecasting (Per-Product); Retail
Case Study: A Top Global Retailer
Broad Scale Predictive Modeling Optimization in Marketing and Retail Sales
This case study will show how we created an automated, high-speed prediction/optimization system by leveraging data mining. Our system predicts retail sales on a product-by-product basis throughout a network of retail stores and is used for planning, logistics, and optimization with respect to pricing, promotion and assortment. There were many challenges, as we work with more than 100,000 products and operate networks of hundreds of brick-and-mortar stores, and predictions must be updated frequently. Additionally, there were constraints related to product promotion and contracts with suppliers that limited our flexibility. We overcame our challenges and achieved new levels of accuracy and reliability.
Speaker: Felipe Fernandez, CEO & Partner, Interefe (Brazil)
Room: Murray Hill A
Track 3: Direct Marketing New Products
Case Study: A Top 5 International Bank
Product Testing in Financial Services: Financial Analytics
Credit marketing has come a long way in today's economy of hard-hitting competition and diminishing customer loyalty. With increasing cut-throat competition, decreasing customer loyalty and the increasing commoditization of banking products, it has become essential in today's sluggish economy for banks to proactively understand changing customer preferences. Understanding these changing preferences can help build a value proposition for the bank, since banks today are flexible enough to align their products with the value needs of their customers. For example, the rapid launch of a new product within the customer base of savings accounts or credit cards would also require testing to accurately understand its value to both the bank and its customers. Traditional testing by direct marketers has involved split groups, compared on an apples-to-apples basis, to measure customers' reactions to different offers. As the number of levels of an attribute increases, the bank needs a much larger number of test groups to establish its customers' value preferences due to the change in rates. Therefore, with changing times, the traditional process of testing has become cumbersome. This turns out to be a gigantic task for the bank, and a scientific method is needed to reduce the test size while gaining the same amount of information. We propose an objective method of testing offers using an experimental design approach. We provide significant insights into the design of banks' offers. We conclude that incremental lifts in response rates are much higher against lower interest rates for home loans and lower late fees for credit cards.
Speaker: Dinabandhu Bag, Associate Professor, National Institute of Technology, India
2:35pm-3:10pm
Room: Grand Ballroom
Keynote
Building Watson – An Overview of the DeepQA Project
Computer systems that can directly and accurately answer people’s questions over a broad domain of human knowledge have been envisioned by scientists and writers since the advent of computers themselves. Open domain question answering holds tremendous promise for facilitating informed decision making over vast volumes of natural language content. Applications in business intelligence, healthcare, customer support, enterprise knowledge management, social computing, science and government could all benefit from computer systems capable of deeper language understanding. The DeepQA project is aimed at exploring how advancing and integrating Natural Language Processing (NLP), Information Retrieval (IR), Machine Learning (ML), Knowledge Representation and Reasoning (KR&R) and massively parallel computation can greatly advance the science and application of automatic Question Answering. An exciting proof-point in this challenge was developing a computer system that could successfully compete against top human players at the Jeopardy! quiz show (www.jeopardy.com).
Attaining champion-level performance at Jeopardy! requires a computer to rapidly and accurately answer rich open-domain questions, and to predict its own performance on any given question. The system must deliver high degrees of precision and confidence over a very broad range of knowledge and natural language content with a 3-second response time. To do this, the DeepQA team advanced a broad array of NLP techniques to find, generate, evidence and analyze many competing hypotheses over large volumes of natural language content to build Watson (www.ibmwatson.com). An important contributor to Watson’s success is its ability to automatically learn and combine accurate confidences across a wide array of algorithms and over different dimensions of evidence. Watson produced accurate confidences to know when to “buzz in” against its competitors and how much to bet. High precision and accurate confidence computations are critical for real business settings where helping users focus on the right content sooner and with greater confidence can make all the difference. The need for speed and high precision demands a massively parallel computing platform capable of generating, evaluating and combining thousands of hypotheses and their associated evidence. This talk will introduce the audience to the Jeopardy! Challenge, explain how Watson was built on DeepQA to ultimately defeat the two most celebrated human Jeopardy! champions of all time, and discuss applications of the Watson technology beyond Jeopardy! in areas such as healthcare.
3:10pm-3:30pm
Room: Grand Ballroom
Diamond Sponsor Presentation
The Analytical Revolution
The algorithms at the heart of predictive analytics have been around for years - in some cases for decades. But now, as we see predictive analytics move to the mainstream and become a competitive necessity for organizations in all industries, the most crucial challenges are to ensure that results can be delivered to where they can make a direct impact on outcomes and business performance, and that the application of analytics can be scaled to the most demanding enterprise requirements.
This session will look at the obstacles to successfully applying analysis at the enterprise level, and how today's approaches and technologies can enable the true "industrialization" of predictive analytics.
Speaker: Colin Shearer, WW Industry Solutions Leader, IBM
3:30pm-3:40pm
Room: Grand Ballroom
Industry Trends: 2011 Data Miner Survey Results: Highlights
Do you want to know the views, actions, and opinions of the data mining community? Each year, Rexer Analytics conducts a global survey of data miners to find out. This year at PAW we unveil the results of our 5th Annual Data Miner Survey.
This session will present the research highlights, such as:
- Demand for data mining
- Open-source data mining software: usage trends
- Data visualization
- Text mining trends
- Measurement of analytic project performance/success
Speaker: Karl Rexer, Ph.D., President, Rexer Analytics
Break / Exhibits
4:15pm-5:00pm
Room: Gramercy B
Track 1: Crowdsourcing Data Mining; Healthcare
Case Study: Kaggle
Predictive Modeling Competitions and the Heritage Health Prize
The biggest news in predictive analytics in 2011 is the launch of the $3 million predictive modeling prize by the Heritage Provider Network (HPN). This is the biggest data mining competition ever and hopes to show the world the power of predictive modeling when applied to healthcare. In an age of big and complex data, competitions appear to be the best way to get the most out of a predictive modeling problem. Whereas a single data scientist can do only so well on a problem, a competition brings in fresh eyes and new ideas, which allows companies and researchers to reach the limit of what's possible. Kaggle has hosted competitions that have improved the state of the art in HIV research, chess ratings and motorway travel time forecasting. This session will cover the Heritage Health Prize as well as some other competition case studies.
Speaker: Anthony Goldbloom, Chief Executive Officer, Kaggle
Room: Gramercy A
Lab Session: Live Topical Demo
Making Predictive Models Count
By now, the value of marketing analytics and predictive models is widely recognized among the business community for completing tasks ranging from optimizing placement of online advertisements to imputing people's movie preferences.
What hasn't been discussed much is that predictive models require valuable resources to build, often have modest-at-best accuracy and uncertain business value. Without proper attention to these issues, analytics teams run the risk of over-promising and under-delivering.
In this case study, we will outline the key steps involved in developing predictive models that deliver true business value. This requires understanding not just the predictive performance but also error rates, error costs, investment costs and ROI. We'll examine a case study of a predictive model used to target selected customers for more expensive marketing communications.
Room: Murray Hill A
Track 3: Uplift Model
True-Lift Modeling: Mining for the Most Truly Responsive Customers & Prospects
Stop spending direct marketing dollars on customers who would purchase anyway!
True-lift modeling can identify:
- which customers will purchase without receiving a marketing contact
- which customers need a direct marketing nudge to make a purchase
- which customers have a negative reaction to marketing (and purchase less if contacted)
This discussion will describe:
- the basic requirements needed to succeed with true-lift modeling
- scenarios where this modeling method is most applicable
- the pros and cons of various approaches to true-lift modeling
Speaker: Kathleen Kane, Principal Decision Scientist, Fidelity Investments
Break / Exhibits
5:25pm-5:45pm
Room: Gramercy B
Track 1: Risk Modelling
Rethinking Analytic Solutions: Modeling Catastrophic Risk
Expanding capabilities for in-database analytics, coupled with growing penetration of analytic appliances into the enterprise data warehousing landscape, pose new opportunities and challenges for practitioners, as well as implications for synergies between analytics and IT. As business enterprises accelerate their move towards advanced analytics for competitive advantage, the analytic environment is evolving to become more responsive to the growing needs for business information and solution implementation. Drawing from a case study in catastrophic risk modelling for the insurance industry, we present an approach to analytical solutions architecture designed to fully leverage in-database analytics technology for greater business value.
Speaker: Shajy Mathai, Executive Vice President & Chief Technology Officer, OpenRisk
Room: Gramercy A
Track 2: Market Mix Modeling
Case Study
Methods for Market Mix Modeling Explained
Marketing Mix Models (aka Marketing Effectiveness, Marketing ROI Models) are relied on to measure and optimize an organization's marketing spend. The quality of these models is extremely important for measuring the ROI, especially for companies spending millions or hundreds of millions of dollars on media. These models are rooted in time series econometrics, and VARMAX models can be leveraged to produce a variety of commonly used models: ARIMAX, Autoregression Models, Vector Autoregressions (VAR), Vector Error Correction (VEC) Models along with Granger Causality, Unit Root and Cointegration Tests – some Chow tests, etc. Traditional approaches produce the requisite graphics (impulse response functions, etc.), but have lacked the ability to easily simulate outcomes. Marketers and Senior Management typically request various model-based scenarios (e.g., how much television spending for a given amount of radio and digital spend?). This case study will endeavor to more easily link simulations, visualization, and econometric methods using simulated data.
Speaker: Donald Cozine, Director of Statistical Analysis, ANALYTICi
Room: Regent
Track 3: Survey Analysis & Churn Risk Detection
Case Study: PayPal
Identifying Customers Who Expressed Intend-to-Churn or Defect from Large Number of Surveyed Verbatim
How can customers’ intent to churn or defect be detected without having to read the large number of customer verbatim feedback? In this case study, we’ll show you how we use a combination of human-classified verbatim (into at-risk and not-at-risk) and query-based text search to build a set of ’at-risk’ verbatim. We then use it as a training set for a supervised learning algorithm to predict and classify a large set of customer verbatim into at-risk and not-at-risk so that actions can be taken to prevent churn and learn from their feedback.
Speaker: Han Sheong Lai, Director of Operational Excellence & VOC, PayPal
5:50pm-6:10pm
Room: Gramercy A
Track 2: Market Mix Modeling
Case Study: Overstock.com
The RecLab $1 Million Prize on Overstock.com: Driving Innovation with Live Data in the Cloud
The RecLab Prize on Overstock.com is a program like the Netflix prize and the Heritage Health Prize, designed to spur innovation in retail personalization by a large community of motivated researchers. We go beyond offline data sets, however, and allow semifinalists and finalists to run their algorithms against live traffic on Overstock.com. This allows us to choose winners not solely based on mathematical notions of algorithmic quality, but on how real shoppers interact with them. The challenge with any contest using real data, whether historical data sets or live traffic, is that personal information about real shoppers must not be released. We solve this problem with a new approach that brings researchers’ algorithms into our secure cloud to run against large sets of live shopping data, rather than exporting anonymized data to the world at large.
Online product recommendations are among the shopping tools most widely used by consumers who need to easily find relevant and enticing products from the millions available online. Overstock.com has worked with this Prize's facilitating vendor since 2009 to present shoppers with dynamic recommendations that grow smarter over time and accurately reflect more than 60 different ways that people shop on the site (by price, by brand, by category). The RecLab Prize on Overstock.com is intended to spur a new generation of research in this area using real data sets at massive scale.
Speaker: Sean Pfister, Senior Market Analyst, RichRelevance®
Room: Regent
Track 3: Financial Indicators from Social Media
Social Media Analysis for Market Prediction: Collective Mood States and the Wisdom of Crowds
Hundreds of millions of individuals are now connected to online social networking services, which are becoming an increasingly important medium for the exchange of personal as well as public information. In fact, more than 150 million tweets, each consisting of a 140-character update by an individual user, are posted on Twitter on a daily basis. Facebook now claims more than 500 million users worldwide who generate personal status updates and other online content by the millions every day. The streams of user-generated information, referred to as "social media feeds", may contain valuable, real-time information on the public's opinions, activities, and mood states. In fact, advances in natural language processing and sentiment tracking algorithms now enable us to leverage computational approaches to efficiently mine the wealth of information in these social media feeds.
In this session I will provide an introduction to the basic principles of online social networking environments and the resulting social media feeds that are generated by their millions of users. Subsequently I will provide an overview of existing text analysis approaches that have been used to extract indicators of social opinion and sentiment from these feeds. A number of recent results demonstrate the value of such analytics to gauge, among many other things, "national happiness" and consumer sentiment towards particular brands and products. In some cases it has even been demonstrated that social media feeds may contain predictive information with regard to a variety of socio-cultural indicators, such as box office receipts, product adoption and even the stock markets. In the latter half of my presentation I will particularly outline our own research on the subject of stock market prediction. My team and I have analyzed large-scale Twitter data to yield accurate measurements of the public's mood state which in turn have been shown to contain predictive information with regard to the Dow Jones Industrial Average. In addition we have performed an analysis of longitudinal changes in individual user sentiment over hundreds of thousands of Twitter users to study the effects of social networking relations on evolving user mood states.
Reception / Exhibits
Local Group Meeting
NYC Predictive Analytics
Bayesians, Frequentists, and Big Data: Musings on Statistics in the 21st Century
This talk will touch upon topics in data analysis, statistics, and computing relating to modern massive data challenges. How do classical theories in statistical inference and asymptotics translate into statistical practice in the modern world? What role should complex Bayesian procedures and other cutting-edge methodologies have in the data analyst toolkit? Computationally, how can we manage the data deluge and how is statistical software evolving? What are the implications for the data analyst? What are the dangers posed by addressing these very questions? I'll suggest possible answers to some of these questions, and hope to spur further debate by posing others.
NYC Predictive Analytics is a non-profit professional group that meets monthly to discuss diverse topics in predictive analytics and applied machine learning. We are a group 1000+ members strong comprised of analysts, computer scientists, engineers, executives, entrepreneurs and students with a deep interest in these fields & related technologies.
Click here for more information about this Local Meeting
Speaker: John W. Emerson, Associate Professor, Department of Statistics, Yale University
[ Top of this page ] [ Agenda overview ]
Thursday, October 20, 2011
Exhibit Hall Open
Registration & Breakfast
9:00am-9:45am
Room: Gramercy
Expert Panel:
Wise Enterprise: Best Practices for Managing Predictive Analytics
Your company is trigger-happy for predictive analytics, and there's plenty of excitement, momentum and public case studies fueling the flames. Are you destined for success or disappointment? Is it a sure-fire win to gain buy-in for a promising analytics initiative, equip your most talented practitioners with a leading solution, and pull the trigger?
This panel of leading experts will address the holistic view. What are the most poignant and telling failures in the repertoire, and where is the remedy? Beyond the management of individual analytics projects, what enterprise-wide communication processes and other best practices provide the best security against project pitfalls? Stay tuned for big answers to these big questions.
Speaker: Wayne Thompson, Analytics Product Manager, SAS
Speaker: Colin Shearer, WW Industry Solutions Leader, IBM
Speaker: Dean Abbott, President, Abbott Analytics
[ Top of this page ] [ Agenda overview ]
9:45am-10:05am
Room: Gramercy
Platinum Sponsor Presentation
Forensic Analytics and Continuous Monitoring: Leveraging Predictive Analytics
Traditional approaches to forensic analytics often involve historical reviews of electronic data requiring after-the-fact data collection. While this investigative method of review is likely to remain both relevant and useful, consider this: what if the data itself could automatically and continuously push risk-based indicators to analysts, decision makers and investigators? In this session we will discuss forensic analytics, the push vs. pull approach to forensic data analysis, and the incorporation of advanced methodologies including predictive analysis, social network analysis, semantic modeling, geospatial analysis, text mining and anomaly detection, as well as the fusion of disparate and third-party data sources for this purpose.
Speaker: Jennifer Boyce, Senior Manager, Deloitte Financial Advisory Services
[ Top of this page ] [ Agenda overview ]
10:05am-10:25am
Lightning Round of 2-minute Vendor Presentations
[ Top of this page ] [ Agenda overview ]
Break / Exhibits
10:55am-11:15am
Room: Gramercy B
Track 1: Integrating Tools
Case Study: Travelers Insurance
Creating more Analytical Bandwidth with R
Connecting to additional software tools from R creates additional analytic bandwidth. Cross-tool communication takes the strengths from multiple tools and empowers the analyst with tools where the sum is more than the parts. More bandwidth means more analytic productivity! Combining tools provides a better match of analyst skill levels to tools and takes advantage of smarter automation to enable/empower the analyst to concentrate on creating more value.
My motivation is to help two groups of analysts here at Travelers: 1) experienced modelers who wish to explore sophisticated modeling tools in R, and 2) new grads who come to us with strong R backgrounds but limited skills with commercial enterprise analytics tools. To be productive, both groups need access to company data and high-performance software located on central UNIX servers (AIX and Linux), as well as to company data repositories such as downstream databases. They need to manipulate and summarize that data, then bring it down to R on the desktop. To date, this has been a very manual, time-consuming process. Recent versions of enterprise analytics tools have added capabilities to access R. Taking another approach, an R package helps communicate with the enterprise commercial tools. The key components that make this possible are other R packages that provide communication via MS Windows COM facilities to access the automation objects and integration technologies on the desktop.
Speaker: Matthew Flynn, Director of Claim Research, Travelers Insurance
[ Top of this page ] [ Agenda overview ]
Room: Gramercy A
Track 2: Fraud Detection
Case Study: U.S. Postal Service Office of Inspector General
Fighting the Good Fraud Fight
Fraud is a costly problem for many businesses, and the efforts required to protect against it further compound the price. We will discuss the cultural and business hazards of addressing versus ignoring fraud, as well as the enormous ROI possible when adaptable quantitative tools are used to detect ever-changing anomalous behavior. Case studies from some of our consulting engagements will highlight lessons learned about what makes a potential fraud detection project ripe for success.
Speaker: Antonia de Medinaceli, Senior Business Analyst, Elder Research, Inc.
[ Top of this page ] [ Agenda overview ]
Room: Murray A
Track 3: Mobile Analytics
Case Study: Citibank
Predictive Analytics in Customer Digital Payments
The digital universe is expected to grow from its current level of 1.2 zettabytes to a whopping 35 zettabytes by 2020. The absence of a digital financial ecosystem within banking and financial services leads customers to use several nascent retail services outside mainstream banking. Global remote payments are estimated to increase nearly threefold within the next five years.
Recent regulatory changes have led customers to seek not only simplified solutions, but also clear information about what the business can or cannot do for them. Consumers are becoming more and more innovative online, using crowd-sourcing among friends, family and their digital contacts to reach decisions in real time. Whereas large banks used to garner information themselves, small start-ups now use crowd-sourcing to operate with innovative and niche business models, with analytics done by processing information using online resources.
This session will highlight and discuss the emerging collaborative business models that use analytics and online and digital media with "platform as a service".
Speaker: Ramendra Sahoo, Senior Vice President of Risk Technology, Citibank
[ Top of this page ] [ Agenda overview ]
11:20am-11:40am
Room: Gramercy B
Track 1: Mobile Analytics & Search
Case Study: Microsoft
Mobile Search Advertising & the Importance of Data in Understanding Customer Intent
This session covers what the mobile paid search marketplace has to offer marketers and explains how analytics helps search engines understand their customers.
You will learn:
- Why you should be leveraging the mobile search advertising space to reach out to potential customers.
- How to understand the intentions of users in mobile search and how it differs from historical PC search.
- Which technologies and processes enable In-Database Analytics to help clients achieve their goals.
Speaker: Will Dannenberg, adCenter R&D Group Program Manager, Microsoft
Speaker: Mukund Raghunath, Geography Head, Mu Sigma
[ Top of this page ] [ Agenda overview ]
Lab Session: Live Topical Demo
Social Network Analytics – Using Influence to drive a better Customer Experience
Social networks have become a central feature of our online lives, both as consumers and enterprises. From FarmVille to LinkedIn to Groupon, many new businesses revolve around knowing your friends and leveraging their collective knowledge, behaviors, opinions and buying power. Indeed, for any data-driven enterprise seeking to provide a personalized and relevant customer experience, it is no longer just about web analytics: effective real-time social network analytics can reap significant rewards. The heart of social network analytics revolves around solving graph problems on large volumes of data at scale and with high performance. This talk will describe real-world use cases of social network analytics and how they are being accomplished by customers of the Vertica Analytic Platform.
Speaker: Shilpa Lawande, Vice President of Engineering, Vertica, An HP Company
[ Top of this page ] [ Agenda overview ]
Room: Murray A
Track 3: Fraud Detection
Case Study: A Major Regional Bank
Business Case Development for Credit and Debit Card Fraud Re-Scoring Models
This session presents a method for determining the business case for deploying predictive analytics for retail banking fraud detection, with a specific focus on credit and debit card fraud. An actual case study using the receiver operating characteristic (ROC) curve of a credit card re-scoring model will be provided. The contingency table, ROC curve, and payoff matrices for the re-scoring model form the basis of the method for computing an NPV and identifying the optimal score thresholds for fraud detection.
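As a rough illustration of the kind of calculation this abstract describes, the sketch below picks a score threshold by combining a contingency table with a payoff matrix, and discounts a stream of payoffs to an NPV. It is not the speaker's method: the scores, labels, payoff values and function names are all hypothetical, chosen only to show the mechanics.

```python
from typing import Dict, List, Tuple

def contingency_table(scores: List[float], labels: List[int],
                      threshold: float) -> Tuple[int, int, int, int]:
    """Count (tp, fp, fn, tn) when transactions scoring >= threshold are flagged."""
    tp = fp = fn = tn = 0
    for score, is_fraud in zip(scores, labels):
        flagged = score >= threshold
        if flagged and is_fraud:
            tp += 1
        elif flagged:
            fp += 1
        elif is_fraud:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def roc_point(counts: Tuple[int, int, int, int]) -> Tuple[float, float]:
    """Return (false positive rate, true positive rate) for one threshold."""
    tp, fp, fn, tn = counts
    return fp / (fp + tn), tp / (tp + fn)

def total_payoff(counts: Tuple[int, int, int, int],
                 payoff: Dict[str, float]) -> float:
    """Weight each contingency-table cell by its payoff-matrix entry."""
    tp, fp, fn, tn = counts
    return (tp * payoff["tp"] + fp * payoff["fp"]
            + fn * payoff["fn"] + tn * payoff["tn"])

def best_threshold(scores: List[float], labels: List[int],
                   payoff: Dict[str, float]) -> Tuple[float, float]:
    """Try each observed score (plus 'flag nothing') and keep the best payoff."""
    candidates = sorted(set(scores)) + [float("inf")]
    best = max(candidates,
               key=lambda t: total_payoff(contingency_table(scores, labels, t), payoff))
    return best, total_payoff(contingency_table(scores, labels, best), payoff)

def npv(cash_flows: List[float], rate: float) -> float:
    """Net present value of end-of-period cash flows at a per-period discount rate."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows, start=1))

# Hypothetical data: model scores and true fraud labels (1 = fraud).
scores = [0.95, 0.80, 0.60, 0.40, 0.20, 0.10]
labels = [1, 1, 0, 1, 0, 0]
# Hypothetical payoff matrix: caught fraud saves 500, a false alarm costs 25,
# missed fraud costs 500, and a correctly ignored transaction costs nothing.
payoff = {"tp": 500.0, "fp": -25.0, "fn": -500.0, "tn": 0.0}

threshold, value = best_threshold(scores, labels, payoff)
```

With these made-up numbers the search settles on a threshold of 0.40, where all three fraud cases are caught at the cost of one false alarm. Feeding the per-period payoff at the chosen threshold into `npv` would then give a figure for the business case, under whatever discount rate is assumed.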
Speaker: Kurt Gutzmann, Managing Director, GCX Advanced Analytics LLC
[ Top of this page ] [ Agenda overview ]
11:45am-12:30pm
Room: Grand Ballroom
Special Plenary Session – Case Studies: Anheuser-Busch, the SSA, Netflix
Data Mining Lessons Learned – Technical & Business - from Applied Projects
In the recounting of analytics projects, my favorite part is "the reveal": where the idea that turned things around is disclosed. Often disarmingly simple (in retrospect) it is virtually always preceded by waves of failure. Yet failure, or at least an environment shockingly tolerant of it, may be essential to the emergence of such breakthroughs.
I will tell tales of some favorite "reveals" that led to technical successes. But, a true win must also be a business success. This requires dealing well with idiosyncratic carbon-based life forms. So we'll also discuss the (painfully acquired) lessons in the parallel universe of business.
Speaker: John Elder, CEO & Founder, Elder Research, Inc.
[ Top of this page ] [ Agenda overview ]
Birds of a Feather Lunch / Exhibits
1:30pm-2:15pm
Room: Grand Ballroom
Keynote
Every Day Analytics: Making Leading Edge Pervasive
As hype increases about the value and quantities of data, the world is beginning to understand the potential of analytics. One doesn't need to be a statistician to get excited: these conversations are no longer just about algorithms. Consumers are directly impacted in their daily lives, as even their running shoes can give them analytical feedback on their workout. In this session, Thomas H. Davenport of the International Institute for Analytics will discuss everyday innovations in analytics and how, as we learn to harness insight from data, our world may change.
[ Top of this page ] [ Agenda overview ]
Break / Exhibits
3:00pm-3:45pm
Room: Gramercy B
Lab Session: Live Topical Demo
The Analytic Advantage: How Banks Can Profit from Customer Insight
Learn how banks are using predictive analytics to improve marketing ROI, boost revenue and grow customers. This session will show you how to gain a deeper understanding of your customers so you can gain more value and bigger returns from every customer interaction. Discover how predictive analytics helps financial service organizations:
- Anticipate what customers want and will do next
- Attract higher value clients who purchase more and stay longer
- Reduce marketing costs and increase marketing ROI
[ Top of this page ] [ Agenda overview ]
Room: Gramercy A
Track 2: Law Enforcement
Case Study: CMPD
Law Enforcement Analytics Solution Helps Identify Potential Criminal Activity
In this session, Robert Broughton, Crime Analyst at Charlotte-Mecklenburg Police Department (CMPD), will discuss how his department used business intelligence to analyze past crime trends and patterns; better determine how resources should be deployed to reduce crime; predict the likelihood of particular crimes based on geography and other factors; and monitor crime activity in real time. CMPD built its Law Enforcement Analytics (LEA) solution to track these and other factors and foster a more effective police force. Robert will highlight the benefits of predictive modeling and provide additional examples of how this solution has benefitted CMPD.
Speaker: Robert Broughton, Crime Analyst, Charlotte-Mecklenburg Police Department
[ Top of this page ] [ Agenda overview ]
Room: Murray A
Track 3: Retention with Churn Modeling
Case Study: GE Capital
Using Segmentation & Predictive Analytics to Reduce Customer Attrition
Attendees will learn more about the following:
- Understanding the different types of attrition and how to measure it through business intelligence.
- The financial impact of attrition and how it affects your business.
- How to develop a Retention Framework using analytical tools such as segmentation and modeling to help combat attrition.
- How to leverage analytical tools and business insights to develop proactive marketing and contact management strategies that increase ROI and optimize marketing investment.
- Learn more about how to successfully implement segmentation and predictive models through case studies and recommended best practices.
This session will appeal to a wide audience including marketing managers, analytics, finance and senior management across a variety of different verticals.
Speaker: David Liebskind, Retail Analytics Leader, GE Capital
[ Top of this page ] [ Agenda overview ]
3:25pm-3:45pm
Room: Gramercy A
Track 2: Law: Forecasting for a Legal Defense
Case Study: A Significant Legal Case
Major Litigation Strongly Supported by Fuzzy Reasoning
A major regional utility had contracted with the world's #1 software services provider for the development, delivery, and installation of a new enterprise billing and tracking system. Failure to deliver resulted in litigation. The utility needed an expert witness to develop case study materials. I was contracted to do so and used industry-standard statistics to develop a fuzzy logic simulation model, which confirmed that the probability of meeting the contracted delivery schedule was less than 0.001.
Speaker: DL vonKleeck, Founder, vK Systems, Inc.
[ Top of this page ] [ Agenda overview ]
Break / Exhibits
4:30pm-4:50pm
Room: Gramercy B
Track 1: Retention with Churn Modeling
Case Study: Paychex, Inc.
Combat Client Churn with Predictive Analytics
In economic conditions such as these, it is critical for businesses to have a strong hold on their client retention efforts. Historically, businesses that excel in this arena have often been better positioned for long-term success and have held a competitive advantage. To optimize the value of retained customers, it is essential to understand which clients are a fit for retention campaigns so that the loss of time and resources is minimized. In this session, we will review how Paychex leveraged two existing models, the Paychex Attrition Model and a custom-built Lifetime Value Model, to create a Retention Tracking System (RTS). Since being deployed across the entire branch network, the Retention Tracking System has become an invaluable resource as offices nationwide strive to meet, and exceed, their retention goals.
Speaker: Frank Fiorille, Director of Enterprise Risk Management, Paychex, Inc.
Speaker: Erika McBride, Manager of Modeling & Risk Review, Paychex, Inc.
[ Top of this page ] [ Agenda overview ]
Track 2: Social Data
Case Study: Match.com
Search and Social: Intelligent Matching at Match.com
What we commonly think of as search is generally one-way: when you search for a book on Amazon or a restaurant on Google, the engine needs to find something the searcher likes. On Match.com, the searcher and the person who is found through the search both need to like each other in order to have success, which creates the need for highly sophisticated search processes. In this session Amarnath will discuss why Match.com is considered one of the most popular consumer uses of predictive analytics, and how the more a user is on the site, the better the matching will be. Its incorporation of static information, user behavior, and community behavior is, we believe, unparalleled in the industry.
Speaker: Amarnath Thombre, Vice President of Strategy & Analytics, Match.com
[ Top of this page ] [ Agenda overview ]
4:55pm-5:15pm
Room: Gramercy B
Track 1: Supply Chain Management
Case Study: United Group Holdings
Value Proposition Segmentation (VPS) Method
In today's competitive market, effective management of customer relations lies in the ability to optimize the dual creation of firm (shareholder) and customer value. Accordingly, the challenge for many companies is to understand and differentiate heterogeneous customers by their needs in order to deliver the winning value proposition profitably. This session will show how our proposed VPS model addresses the basic managerial concern of balancing relationships from both the seller's (customer loyalty) and the buyer's (customer benefit) perspectives, by considering both the service provider's financial performance (i.e. customer value to the firm) and the value customers receive from the provider's offerings (i.e. customer benefit).
Speaker: Amjad Zaim, Ph.D., CEO & Co-Founder, Cognitro Analytics
[ Top of this page ] [ Agenda overview ]
Room: Gramercy A
Track 2: Social Data: Advanced Methods
Network Maps for End Users: Collect, Analyze, Visualize and Communicate Network Insights with Zero Coding
Networks are everywhere except the end user desktop. NodeXL, the free and open network overview, discovery and exploration add-in for the popular and familiar Excel (2007/2010) spreadsheet, allows users who are comfortable making pie charts to now make useful network visualizations. Developed and released by the Social Media Research Foundation, NodeXL uses Excel as a framework, providing a GUI network browser (a "web browser"?) that novices can use quickly and experts can use to generate sophisticated results. Data importers provide access to a range of social media network data sources like Twitter, Flickr, YouTube, Facebook, email, the WWW, and more through standard file formats (CSV, GraphML, Matrix). Simple-to-use tools can automatically analyze, visualize and highlight insights in complex network graphs. Using NodeXL, researchers have been collecting a wide range of network data sets from various social media services. The resulting network images reveal a range of common social formations in social media and point to people who occupy strategic locations in these graphs.
Speaker: Marc A. Smith, Chief Social Scientist, Connected Action Consulting Group
[ Top of this page ] [ Agenda overview ]
4:30pm-5:15pm
Room: Murray A
Track 3: Black Box Trading
Case Study: Rebellion Research
Humans, Rules & Machine Learning: Three Prediction Paradigms
In this session, I will discuss three common ways to make predictions: relying on the human brain directly, designing software that implements rules developed by human experts, and utilizing machine learning based prediction methods. I will analyze the advantages and disadvantages of each of these three approaches, using the stock market as a case study of an area where all three methods are actively applied, but where prediction is especially challenging.
[ Top of this page ] [ Agenda overview ]
Friday, October 21, 2011
Full-day Workshop
Hands-On Introduction to Text Analytics with IBM SPSS
Click here for the detailed workshop description
- Workshop starts at 9:00am
- First AM Break from 10:00 - 10:15
- Second AM Break from 11:15 - 11:30
- Lunch from 12:30 - 1:15pm
- First PM Break: 2:00 - 2:15
- Second PM Break: 3:15 - 3:30
- Workshop ends at 4:30pm
Instructor: Tim Daciuk, Business Development Manager, Advanced Analytics, IBM
[ Top of this page ] [ Agenda overview ]
Full-day Workshop
The Best and the Worst of Predictive Analytics: Predictive Modeling Methods and Common Data Mining Mistakes
Click here for the detailed workshop description
- Workshop starts at 9:00am
- First AM Break from 10:00 - 10:15
- Second AM Break from 11:15 - 11:30
- Lunch from 12:30 - 1:15pm
- First PM Break: 2:00 - 2:15
- Second PM Break: 3:15 - 3:30
- Workshop ends at 4:30pm
Instructor: John Elder, CEO & Founder, Elder Research, Inc.
[ Top of this page ] [ Agenda overview ]
Full-day Workshop
Deploying User-Friendly Predictive Analytics: Delivering Results to Business Users with Interactive Applications
Click here for the detailed workshop description
- Workshop starts at 9:00am
- First AM Break from 10:00 - 10:15
- Second AM Break from 11:15 - 11:30
- Lunch from 12:30 - 1:15pm
- First PM Break: 2:00 - 2:15
- Second PM Break: 3:15 - 3:30
- Workshop ends at 4:30pm
Instructor: Jeff Mergler, Lead Statistical Applications Trainer, TIBCO Spotfire
[ Top of this page ] [ Agenda overview ]
*Rooms are subject to change