Day 1: Tuesday, October 13, 2015
REGISTRATION & BREAKFAST
8:30-8:45am • Room: Constitution B
Conference Chair Welcome Remarks
8:45-9:30am • Room: Constitution B
KEYNOTE
Harnessing the Data Revolution for International Development
Data is becoming cheaper and more available, and information technologies have become much more advanced and widespread. Yet USAID's mission to end extreme poverty requires working in challenging environments where data are often unreliable or unavailable and where the resources to turn those data into insights are limited. Dr. Peterson will discuss key changes in the data-for-development landscape, implications for achieving a world without extreme poverty, and examples of how USAID and others are applying new analytical tools and approaches to meet this goal.
Division Chief, Data and Analytics
U.S. Agency for International Development (USAID)
9:30-10:00am • Room: Constitution B
Diamond Sponsor Presentation
Solving the Previously Impossible
10:30-11:15am
Track 1
Insider Threat Panel
According to a recent survey report by Crowd Research Partners, 62 percent of security professionals said insider threats have become more frequent in the last 12 months, yet fewer than 50 percent of organizations have appropriate controls to prevent insider attacks. Furthermore, 62 percent of respondents said that insider attacks are far more difficult to detect and prevent than external attacks. Deterring insider threats is also a serious concern for the U.S. government, especially for the U.S. Intelligence Community (IC). Over the past century, the most damaging U.S. counterintelligence failures were perpetrated by trusted insiders with ulterior motives, according to the National Counterintelligence and Security Center. This Insider Threat Panel, with experts from NSA, NGA, DIA, and the National Insider Threat Task Force, will examine the role that predictive analytics plays in identifying potential insider threats.
Deputy Insider Threat Program Manager
National Geospatial-Intelligence Agency's (NGA) Security & Installations Directorate
10:30-11:35am • Room: Constitution B
Track 2
The Changing Face of Analytics at Federal Agencies: A View from the IRS
It has been nearly 20 years since the IRS first began testing the use of decision trees for classification models. Since then, other statistical and data mining techniques have been evaluated for classification and prediction, including k-Means, CART, logistic regression, kNN, Naïve Bayes, EM, LDA, and kernel methods. In addition, the use of link analysis and text mining continues to be valuable for pattern extraction. But what about adoption rates? What determines which of these techniques, and the technologies that support them, actually get implemented for operational use? How has that changed in the last 20 years, and will current practices still work in the next 20 years? This session explores such factors from the IRS perspective, including:
- Workforce skills and the role of data science
- Lab and research environments for deep testing
- Ability to fit new techniques and technologies into old systems
- Fitness of the business-IT relationship
Associate Director of Data Management
IRS Research, Analysis, and Statistics organization
10:30-11:35am
Mini Course Technical Track
Data Munging/Wrangling with R
One of the most time consuming steps in any analytics project is getting the data into a format that is suitable for analysis. "Data munging," or "data wrangling" as it is commonly called, is the process of cleaning, transforming and converting your data from its initial form into that which is necessary for modeling to occur. "R" is an open source statistical programming language commonly used for analyzing and visualizing data. Through R and its many available add-on packages you are provided a powerful means of slicing and dicing your data, no matter how dirty it may be, into the required format. In this mini-course, you will learn about the tools in R that make this process of data wrangling and munging more manageable.
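The mini-course itself is taught in R; purely to illustrate what "munging" looks like in code, here is the same kind of cleanup sketched in plain Python (the records and field names below are hypothetical):

```python
# Hypothetical messy records: inconsistent case, stray whitespace,
# numbers stored as strings with thousands separators, missing values.
raw = [
    {"agency": "IRS ",  "spend": "1,200"},
    {"agency": "usaid", "spend": "350"},
    {"agency": None,    "spend": "75"},   # no agency -> unusable row
    {"agency": "NGA",   "spend": None},
]

def munge(rows):
    """Clean, transform, and convert rows into an analysis-ready form."""
    out = []
    for r in rows:
        if r["agency"] is None:           # drop rows missing the key field
            continue
        out.append({
            "agency": r["agency"].strip().upper(),
            "spend": float(r["spend"].replace(",", "")) if r["spend"] else None,
        })
    return out

clean = munge(raw)
# clean == [{'agency': 'IRS', 'spend': 1200.0},
#           {'agency': 'USAID', 'spend': 350.0},
#           {'agency': 'NGA', 'spend': None}]
```

In R, packages such as dplyr and tidyr wrap these same steps in a small set of composable verbs, which is what makes the language popular for this work.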
Limited seating available. Priority will be given to government employees who sign up during the registration process.
11:20am-12:05pm
Track 1
From Wisdom to Insight: Driving Strategic Decision Making with Predictive Analytics
Tasked with providing the Army with access to visual, actionable analytics, COL Bobby Saxon led the redesign of the Enterprise Management Decision Support (EMDS) system to provide a strategic view of readiness for Army Leaders. The system brings together disparate data to provide one access point with valuable analytic displays. But this is only the beginning. With the vast amount of historical data the system brings together, the ability to forecast readiness and provide predictive models for future readiness is essential. During this session, COL Saxon will describe how he worked across the Army to create EMDS, discuss the way forward for predictive analytics, and explore the "art of the possible" that drives his vision for the program.
Chief, Force Management Enterprise Division, Force Management Directorate, Office of the Deputy Chief of Staff
Department of the Army
11:20am-12:05pm • Room: Constitution B
Track 2
Case Study: Predicting the Outcome of Employee Benefits Security Administration Retirement Plan Investigations
EBSA oversees a population of more than 700,000 retirement plans with a budget that allows for approximately 3,000 investigations per year. In this resource-constrained environment, focusing investigator effort on the most problematic plans is a top priority. This case study details a project aimed at predicting the outcome of EBSA investigations using observable plan characteristics such as plan assets, plan expenses, and the industry of the plan sponsor. The project team has faced and overcome many challenges including integrating the data, defining an outcome variable, back-testing the model, and incorporating the model results into EBSA's broader enforcement program.
11:20am-12:05pm
Mini Course Technical Track
Theory-Driven Modeling
Utilizing insights from the field of social network analysis, this mini-course will detail an innovative approach to assess relative importance and risk based on the interactions within a network. Using this approach, this course will demonstrate how to analyze a network of actors and rank their importance relative to a person-of-interest, which helps investigators focus their efforts on the most interesting associates. You will learn how this methodology can be used in scenarios as different as identifying the important actors in the 9/11 hijacker network, as well as tracing risk within a network of financial transactions.
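As a rough sketch of the core idea, ranking actors by how close they sit to a person-of-interest in an interaction network, consider this minimal breadth-first traversal (the network and all names are hypothetical; real social network analysis uses richer centrality and risk measures than plain distance):

```python
from collections import deque

# Hypothetical network of actors; each edge is an observed interaction.
edges = [("poi", "a"), ("poi", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "e")]

def rank_by_proximity(edges, person_of_interest):
    """Rank actors by shortest-path distance from the person-of-interest,
    a simple proxy for relative importance within the network."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    dist = {person_of_interest: 0}
    queue = deque([person_of_interest])
    while queue:                          # standard breadth-first search
        node = queue.popleft()
        for nbr in graph.get(node, ()):
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return sorted((n for n in dist if n != person_of_interest), key=dist.get)

ranked = rank_by_proximity(edges, "poi")  # closest associates first
```

Investigators can then spend their time on the head of the ranked list rather than the whole network.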
Limited seating available. Priority will be given to government employees who sign up during the registration process.
12:05-1:00pm • Room: Constitution E
Lunch
1:00-1:45pm • Room: Constitution B
KEYNOTE
Follow the Money: Harnessing the Power of Advanced Analytics to
Combat Financially Motivated Crimes
Hear from the Director of the Financial Crimes Enforcement Network (FinCEN), a bureau of the U.S. Department of the Treasury, about how the bureau is harnessing advanced analytics to combat some of our nation's greatest threats, including terrorist organizations, foreign corruption, cyber threats, transnational criminal and drug trafficking organizations, and massive fraud schemes.
1:45-2:30pm • Room: Constitution B
PLENARY SESSION
Top Five Technical Tricks to Try when Trapped
There's no better source for tricks of the analytics trade than Dr. John Elder, an established industry leader, acclaimed training workshop instructor, and author -- well-known for his "Top 10 Data Mining Mistakes" and advanced methods like Target Shuffling. In this special plenary session, Dr. Elder, CEO & Founder of Elder Research, North America's largest pure-play consultancy in predictive analytics, will cover his Top Five methods for boosting your practice beyond barriers and gaining stronger results.
Also sign up for Dr. John Elder's one-day workshop, The Best and the Worst of Predictive Analytics: Predictive Modeling Methods and Common Data Mining Mistakes.
2:30-2:45pm • Room: Constitution E
BREAK
2:45-3:30pm
Track 1
Open Data in Government
Session description coming soon.
2:45-3:30pm • Room: Constitution B
Track 2
Analytics in Auditing: Putting the Passion Back in the Process!
Is analytics just a buzzword in your audit operation? How about these phrases: continuous auditing? Risk-based audits? Data-driven decisions? Has anyone defined in practical terms just what these terms mean to the auditor?
This talk uses real business examples to explore the reality of transforming the planning and execution of auditing from tactical to strategic.
Former Deputy Assistant Inspector General for Analytics
US Postal Service Office of Inspector General
2:45-3:30pm
Mini Course Technical Track
Beyond SQL: Introduction to Graph Databases
Organizations have historically relied on relational databases to store data and SQL to query it. Despite their name, relational databases are not the best tool for quickly finding relationships among entities. This mini-course introduces the concept of graph databases, which have great advantages in flexibility and speed for analyzing how entities are connected. Areas in which graph databases have been employed include fraud detection, recommendations, logistics, and social networks. This course will include live demonstrations showing how government agencies can join industry leaders in adopting this technology.
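The flexibility-and-speed claim comes down to this: a graph store keeps relationships as direct links, so "who is connected to X?" is a traversal rather than a chain of SQL joins. A minimal in-memory sketch of the idea (all entities and data here are hypothetical, not any particular graph database's API):

```python
# Nodes are entities; each entry lists the entities it is directly linked to.
relationships = {
    "acct_1":   {"person_a", "person_b"},
    "person_a": {"acct_1", "acct_2"},
    "person_b": {"acct_1"},
    "acct_2":   {"person_a", "person_c"},
    "person_c": {"acct_2"},
}

def within_hops(graph, start, max_hops):
    """Entities reachable from `start` in at most `max_hops` relationship steps."""
    frontier, seen = {start}, {start}
    for _ in range(max_hops):
        # Expand one relationship step at a time, skipping visited nodes.
        frontier = {nbr for node in frontier for nbr in graph.get(node, ())} - seen
        seen |= frontier
    return seen - {start}

# Two hops from acct_1 reaches acct_2 via person_a -- the kind of query
# that would take one JOIN per hop in a relational schema.
reachable = within_hops(relationships, "acct_1", 2)
```

In a real graph database this traversal is expressed in a query language (e.g. Cypher in Neo4j) and runs against indexed, persistent storage, but the access pattern is the same.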
Limited seating available. Priority will be given to government employees who sign up during the registration process.
3:35-4:20pm
Track 1
Building Sustainable Automation and Predictive Systems
Developing an analytics capability today is no longer difficult, but building a sustainable approach takes concerted effort. Drawing on a 25-year career in developing decision support and predictive systems, this presentation will identify some key challenges and approaches to resolving them. Examples will be drawn from a variety of experiences, including building the National Criminal Intelligence Fusion Capability (Fusion) within the Australian Crime Commission. Fusion brought together many different agencies with over a thousand datasets to share and analyze information and intelligence holdings to develop insights that support collaborative responses to serious and organized criminal threats. AUSTRAC is currently on a similar journey to create the technology and tools for data collection, ingestion, fusion, management and analytics to support the analysis and production of financial intelligence. Examples will also be drawn from the capital market and other domains. This session will describe the enablement of:
- Improved understanding and alerting of known risks.
- Discovery of previously unknown risks such as organized criminal activity.
- Building a sustainable whole-of-organisation approach to analytics.
- Measuring the benefits of productivity improvement through automation.
National Manager Innovation & Technology / Chief Information Officer (CIO)
AUSTRAC, and an Adjunct Professor at the University of Canberra
3:35-4:20pm • Room: Constitution B
Track 2
Evaluating the Alignment of Grant Resources and Community Need Using Bayesian Spatial Probit Models: A Case Study of AmeriCorps
Many federal grants are intended to be allocated based on community need. Evaluating this premise is challenging because of complex spatial relationships among communities and grantees. Both communities in need and grantees tend to cluster geographically, which must be properly accounted for. Additionally, spatial algorithms can be computationally burdensome. We integrate spatial weights into a probit model using Bayesian MCMC methods via open source packages in R, and speed up data management and estimation using sparse matrices and parallel processing. Results show significant spatial dependency, and more accurate estimates of alignment between grant allocation and key indicators of community need.
Director of Data and Technical Analysis, Office of Human Services
Assistant Secretary for Planning and Evaluation, US Department of Health & Human Services
3:35-4:20pm
Mini Course Technical Track
Improving Predictive Accuracy with Ensembles
We don't always know beforehand which algorithm will work best for a given data set. Ensembles combine the results of multiple models into one that is usually more accurate than the best of its components. Ensembles can provide a critical boost to industrial challenges – from investment timing to drug discovery, and fraud detection to recommendation systems – where predictive accuracy is more vital than model interpretability. This mini-course will introduce you to the concept of ensembles and explain how you can use ensemble methods to improve the accuracy of your predictive models.
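The intuition — that combining models whose errors differ yields a better model than any single component — can be shown with a toy majority vote (all predictions below are made up purely for illustration):

```python
# Three models' predictions on six cases; each model is wrong on two
# DIFFERENT cases, so no single model is right everywhere.
truth   = [1, 1, 1, 0, 0, 0]
preds_a = [0, 0, 1, 0, 0, 0]   # wrong on cases 0, 1
preds_b = [1, 1, 0, 1, 0, 0]   # wrong on cases 2, 3
preds_c = [1, 1, 1, 0, 1, 1]   # wrong on cases 4, 5

def accuracy(preds):
    return sum(p == t for p, t in zip(preds, truth)) / len(truth)

def majority_vote(*all_preds):
    """Ensemble by hard voting: predict 1 when at least 2 of 3 models do."""
    return [int(sum(votes) >= 2) for votes in zip(*all_preds)]

combined = majority_vote(preds_a, preds_b, preds_c)
# Each component is right on 4 of 6 cases; because their errors never
# overlap, every case still gets two correct votes, so the ensemble is
# right on all 6.
```

The effect weakens as the models' errors become correlated, which is why ensemble methods like bagging and boosting work to build diverse component models.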
Limited seating available. Priority will be given to government employees who sign up during the registration process.
4:25-5:10pm
Track 1
How Does Policing Dosage Reduce Crime? A Predictive Policing Case Study
More police patrol is thought to have a bigger impact on crime, but the details of how policing dosage works are poorly understood. Using data on more than 9,000 recorded hours of policing dosage across more than 20,000 dosage events in micro-scale crime hot spots, this session reviews the scientific evidence for the effect of patrol dosage when paired with accurate crime predictions. Practical conclusions that can be implemented in an effective predictive policing strategy are presented.
4:25-5:10pm • Room: Constitution B
Track 2
The Application of Predictive Analytics to Juvenile Justice: The Florida Experience
Participants will hear the findings from a study in which predictive analytics was used to enhance the performance of the risk assessment instrument used by the Florida Department of Juvenile Justice. The Department uses the PACT risk assessment instrument, which has about a 60% accuracy rate in predicting future delinquent behavior. The algorithms developed using predictive analytics achieved an 82% accuracy rate.
4:25-5:10pm
Mini Course Technical Track
Analyzing Semi-Structured Data at Volume in the Cloud
The Cloud, Mobile and Web Applications are producing semi-structured data at an unprecedented rate. IT professionals continue to struggle to capture, transform, and analyze these complex data structures mixed with traditional relational-style datasets using conventional massively parallel processing (MPP) and/or Hadoop infrastructures. Public cloud infrastructures such as Amazon and Azure provide almost unlimited resources and scalability to handle both structured and semi-structured data at petabyte scale. These new capabilities, coupled with traditional data management access methods such as SQL, give organizations and businesses new opportunities to leverage analytics at an unprecedented scale while greatly simplifying data pipeline architectures and providing an alternative to the "data lake."
This mini-course will cover these topics and will focus on analyzing structured and semi-structured data together using Snowflake, a commercially available cloud-based platform, and standards-based SQL to provide insights on petabyte-scale data sets.
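Platforms like Snowflake expose semi-structured data to SQL directly; the underlying idea — projecting a nested record into flat, queryable columns — can be sketched as follows (the record and its field names are hypothetical):

```python
import json

# Hypothetical semi-structured record, as a web application might emit it.
event = json.loads("""
{"user": {"id": 42, "region": "east"},
 "actions": [{"type": "login"}, {"type": "upload", "bytes": 1024}]}
""")

def flatten(obj, prefix=""):
    """Flatten nested JSON objects/arrays-of-objects into dotted column
    names, so the record can be queried alongside relational tables."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, name + "."))
        elif isinstance(value, list):
            for i, item in enumerate(value):
                flat.update(flatten(item, f"{name}[{i}]."))
        else:
            flat[name] = value
    return flat

flat = flatten(event)
# flat == {'user.id': 42, 'user.region': 'east',
#          'actions[0].type': 'login',
#          'actions[1].type': 'upload', 'actions[1].bytes': 1024}
```

In Snowflake the same effect is achieved declaratively by querying a semi-structured column with path expressions and FLATTEN, rather than writing traversal code.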
Limited seating available. Priority will be given to government employees who sign up during the registration process.
5:10-6:10pm • Room: Constitution E
NETWORKING RECEPTION
Day 2: Wednesday, October 14, 2015
7:30-9:00am • Room: Constitution Foyer
REGISTRATION & BREAKFAST
9:00-9:15am • Room: Constitution B
Welcome Remarks
9:15-10:00am • Room: Constitution B
KEYNOTE
Reforming Government: Where is the Public Servant to Start?
The press, the taxpayer, the boss . . . everyone demands that government cut waste, stop fraud, and use its existing resources efficiently. This is where analytics come in, but where is the public servant to start?
- The IT shop already has a full time job and is not usually involved in policy making.
- The policy maker knows what problems need to be solved, but does not usually understand the data and systems capabilities.
- Vendors use IT vocabulary when addressing the policy maker or business school vocabulary when selling to the data scientists.
How is a program manager, policy maker, or subject matter expert supposed to sift through the sales pitch to determine what exactly is needed to improve the operations of the agency and determine whether the current data and systems can be leveraged to accommodate the decisions?
As South Carolina's recently retired Secretary of Labor, Licensing and Regulation and subsequently Director of the SC Department of Health and Environmental Control, Catherine Templeton has advised on or implemented analytics projects for numerous state agencies and federal programs and has the scars to prove it. She will suggest who should be in the room when decisions are made and how to determine what analytics infrastructure you really need to solve the practical issue at hand - regardless of what the salesman tells you.
Former Director of the South Carolina Department of Health and Environmental Control and Former South Carolina Secretary of Labor
10:30-11:15am • Room: Constitution A
Track 1
Academia Panel
As organizations become increasingly aware of the potential value contained in their various data sets, the demand for highly skilled data scientists to exploit the data and unearth the value continues to grow at an amazing pace. This is true in both the private and public sectors. According to a McKinsey Global Institute report, by 2018 the U.S. alone could face a shortage of over 140,000 people with deep analytical skills. Academia has been trying to keep pace over the past few years to meet the demand with new undergraduate and graduate data science programs. In this Academia Panel, you will hear from representatives of North Carolina State University, George Mason University, and Johns Hopkins University about how they are attempting to bridge the gap between the supply and demand of data science professionals, and how they are teaming with government to address its specific needs.
Larry Koskinen
Business Transformation Team Lead for New Core
Department of Housing and Urban Development
Panelists:
Assistant Professor of Analytics
Institute for Advanced Analytics, North Carolina State University
Professor and Chair, Systems Engineering & Operations Research
Volgenau School of Engineering, George Mason University
10:30-11:15am • Room: Constitution B
Track 2
IRS Preparer-Level Treatment Tests
In 2012, the Return Preparer Office (RPO) at the Internal Revenue Service (IRS) implemented a multi-year study to address compliance through paid return preparer-based treatments. The driver of the preparer-based approach is that treatment of a single preparer is likely to improve the compliance of many taxpayers, increasing the expected Return on Investment (ROI) of treatment resources. Because the IRS has historically focused on taxpayer-level treatments, there is currently only a limited understanding of how preparer-based treatments effect change in preparer and client compliance. The goal of the multi-year study is to understand which treatments are effective on different segments of the non-compliant preparer population. In this paper we analyze the results from the first two years of the study.
In the first year, a controlled test was used to assess the effectiveness of three different types of preparer-level treatments:
- An educational visit to the preparer by a Revenue Agent to discuss issues found on returns that the preparer had prepared.
- A letter reminding the preparer of their due diligence requirements when preparing returns and warning that they and their clients might be subject to audit.
- A letter with the same message regarding due diligence, but also recommending that, as part of the continuing education required at that time, the preparer take a minimum of four hours of continuing education regarding the specific issue.
In the second year of the study, an additional, purely educational letter was tested. Since results from the first year were not yet available, the original three treatments were reemployed in the second year as well. All treatments were applied prior to the start of the filing season, and the effectiveness of the treatments is determined from returns filed in the subsequent processing year.
Preparers in the first year are also followed an additional year to determine the extent of recidivism. The results from both the first and second year of the study are presented in this paper.
11:20am-12:05pm • Room: Constitution A
Track 1
Mining the Voter File
In this session, we will describe how the Los Angeles County Registrar-Recorder/County Clerk's office has used data science to clean and tidy its voter file, conduct exploratory data analysis, and ultimately develop a predictive analytic that could comb through 4.9 million voters and help the Department determine who is likely to work as a poll worker (if called by the Department) and who is not. This analytic is intended to assist the Department's poll worker recruitment efforts and cut down on wasted time, energy, and resources.
Executive Assistant, Data Scientist
Los Angeles County Registrar Recorder/County Clerk's Office
11:20am-12:05pm • Room: Constitution B
Track 2
Location Analytics
Geospatial technology has dramatically improved the workflow and traditional methods used to prevent and detect fraud. The ability to visualize data has revealed trends and patterns that went undetected in their original tabular format. This presentation introduces how the United States Postal Service Office of Inspector General utilizes these tools and methods to aid Special Agents in combating fraud. Examples demonstrate how the implementation of GIS and the use of spatial analytics have improved efficiency and effectiveness, increased the identification of investigative leads, and identified potential problem areas for corrective action.
12:05-1:00pm • Room: Constitution E
LUNCH
1:00-1:45pm • Room: Constitution B
KEYNOTE
Actionable Data is the Key to Improving Population Health: A CDC Perspective
Health data exists in many forms and is crucial to our efforts to improve public health. Data may come from population surveys, surveillance systems, or clinical encounters and, with emerging information technology advances, can be obtained more quickly and in greater volume than ever before. The key to improving health is really how that data is then used to inform public health interventions and policies. In order to maximize the potential for public health action, data must be timely and affordable and must provide answers or insights into the health issues of interest to public health. Chesley Richards will share insights on the current state of health data and its use in public health, CDC health activities, and innovations in data access, storage and visualization.
1:45-2:30pm • Room: Constitution B
PLENARY SESSION
From Analytics to Impact
Hear the first Chief Data Officer in the Office of Inspector General at the Department of Health and Human Services talk about how the office is leveraging advanced analytics to fight fraud, waste, and abuse in HHS programs, including Medicare and Medicaid. Dr. Brzymialkiewicz ("Briz-mock-oh-wits") will share her observations on what it takes to achieve impact through actionable analytics.
Assistant Inspector General / Chief Data Officer
Office of Inspector General, US Department of Health and Human Services
2:30-3:15pm
BREAK
3:15-4:00pm • Room: Constitution A
Track 1
Data Strategy 2.0: Focus on the Roadmap and Implementation
As businesses struggle with the data flood, it is even more critical to focus on data as an asset that directly supports business imperatives. Organizations across most industries attempt to address data opportunities and data challenges to enhance business unit performance. Unfortunately, the results of these efforts fall far below expectations due to haphazard approaches. Poor organizational data management capabilities are the root cause of many of these failures. This presentation covers three lessons, which will help you establish realistic plans and expectations, and help demonstrate the value of such actions to internal and external decision makers.
3:15-4:00pm • Room: Constitution B
Track 2
Predictive Analytics for Population Health Management
Healthcare spending is a major fiscal issue, especially for state/local governments where it consumes over 30% of revenue, and costs rose 8% in 2012 versus 4% overall US healthcare spending increase. State employee health plans must balance benefits that attract talent with fiscal pressures. Here we chronicle how one state employee health plan reduced medical trend to zero using a wide range of population health management tactics. Once traditional care management tactics were implemented, however, it had to find new ways to enhance population management in order to maintain fiscal results. Learn new population health management predictive analytics approaches.
4:00-4:45pm • Room: Constitution A
Track 1
Un-Sensored: Analytics for Lower-Tech Facilities
Government agencies that own large facility portfolios lag their for-profit counterparts in using predictive analytics to make real-time or strategic decisions, primarily because doing so requires sophisticated sensors and newer facility management technologies seen mostly in urban and private enterprise. However, the National Park Service was able to use Bayesian network learning on its limited data to reverse-engineer the thought process of park managers and better predict facility priority scores for over 28,000 buildings. This holds important lessons in using probabilistic approaches to inform large capital planning and facility operations decisions, especially at less sophisticated, older, and un-sensored facility portfolios.
4:00-4:45pm • Room: Constitution B
Track 2
Moving Data Analytics from Proof to Production
Predictive analytic solutions in government systems are not new, nor particularly novel. We are aware of successful implementations built on methods ranging from traditional statistical regression to machine learning and data mining techniques. While practitioners develop successful proofs-of-concept, not all proofs become part of a production system. What are the characteristics of successful projects, and why do the others fail? Join Mr. Stephen Dennis, Innovation Director, Homeland Security Advanced Research Projects, and Mr. Dave Vennergrund, Director of Data Analytics, Salient Federal Solutions, as they explore these questions and share best practices to help ensure your proof makes an impact in production.