Email: Sam@PredictiveInsightsAI.com

Predictive artificial intelligence (AI) is a type of computer program that uses statistical analysis and machine learning to predict future events, behaviors, and patterns. Predictive AI is also known as predictive analytics.

Models continue to become more accurate, but for certain applications the uncertainty of an estimate is as important as the prediction itself. Prediction intervals, which give a range of values instead of a single number, allow this uncertainty to be measured using quantile regression. In insurance and finance, where model risk translates into financial gain or loss, predictive analytics practitioners can reduce losses and increase profits by hedging against this risk.

Using gradient boosting, a widely used machine learning method, two use cases are demonstrated on open-source, reproducible data:

  1. Add value to actuarial predictive models by creating prediction intervals;
  2. Audit data quality in a fully-scalable manner using outlier detection.

Introduction


Predictive analytics is the science of using statistics to predict the future. This process often involves identifying a business problem, collecting relevant data, and then making predictions that help to improve or solve the problem. As advances in the field of machine learning produce better algorithms, the relative size of errors continues to decrease. But even the best models are random to some degree.

In real-world applications, a prediction about a future event will never completely match reality. Even with the most robust validation, we can never know for certain how well a given model will perform on unseen data. Data collected outside of the laboratory is often sparse for certain segments of the population. In health care, models are built from thousands of diagnosis codes, and each patient may only have a few dozen codes in common with other patients. Credit unions and banks create models from purchase histories which contain varying line-item keywords. Websites use clickstream data which contains millions of user inputs. In all of these examples, the data quality can vary dramatically from case to case. Intuitively, predictions based on sparse, low-quality data will be less reliable than those based on richer data.

For financial applications where the prediction is a dollar amount, model risk translates into financial risk. In Modern Portfolio Theory (MPT), investment risk is measured by the standard deviation of returns: the tendency of an asset's return to land above or below its average. Portfolio managers need to know the risk level as well as the average return in order to maintain profitability.
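As a rough illustration, here is a minimal R sketch, on simulated (hypothetical) monthly returns, of measuring risk as the standard deviation of returns alongside the average return:

```r
# Minimal sketch: risk as the standard deviation of returns (simulated data)
set.seed(42)
returns <- rnorm(120, mean = 0.006, sd = 0.04)  # hypothetical monthly returns

avg_return <- mean(returns)  # the expected (average) return
risk       <- sd(returns)    # the risk: spread of returns around the average

round(c(average = avg_return, risk = risk), 4)
```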

In insurance, actuaries are the original scientists of risk. Traditionally, actuaries use parametric models which allow not only the average to be estimated, but also other risk measures. The Value at Risk (VaR) is the name given to the pth quantile of the loss distribution, usually the 90th or 99th percentile. Two entities with the same average but different VaR are valued very differently.
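A minimal sketch, using simulated (hypothetical) losses, of how VaR is simply a quantile of the loss distribution:

```r
# Minimal sketch: VaR as the pth quantile of a simulated loss distribution
set.seed(123)
losses <- rlnorm(10000, meanlog = 8, sdlog = 1)  # hypothetical claim losses

var_90 <- quantile(losses, probs = 0.90)  # 90th percentile VaR
var_99 <- quantile(losses, probs = 0.99)  # 99th percentile VaR

c(mean_loss = mean(losses), VaR_90 = unname(var_90), VaR_99 = unname(var_99))
```

Two books of business could share the same mean loss here while having very different VaR values in the tail.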

The Actuarial Standards of Practice (ASOPs) describe the procedures an actuary should follow when performing actuarial services. ASOP No. 23 speaks to this issue of data quality and how the uncertainty behind results should be acknowledged.

Appropriate data that are accurate and complete may not be available. The actuary should use available data that, in the actuary’s professional judgment, allow the actuary to perform the desired analysis. However, if significant data limitations are known to the actuary, the actuary should disclose those limitations and their implications in accordance with section 4.1(b).

Much research has been done to create machine learning models that produce more accurate averages, but less research focuses on measuring the uncertainty of those predictions. One of the most widely applicable algorithms, the gradient boosting machine (GBM), can estimate prediction error using quantile regression. This allows for predictions of not only the average, but also the range or uncertainty of the prediction. This helps to solve the problem of model risk: if an observation is predicted based on sparse data, then the prediction interval for this record will be wide; if the prediction is based on strong evidence, or in actuarial terminology has “higher credibility”, then the prediction interval will be narrower.
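As a rough sketch of this idea, on simulated, hypothetical claim data rather than the datasets used in this article, the gbm package can fit the lower and upper quantiles directly by setting the distribution to quantile loss, and the two fits together form a prediction interval:

```r
# Minimal sketch: a 90% prediction interval from quantile-loss GBMs (gbm package)
library(gbm)

set.seed(1)
n <- 2000
dat <- data.frame(
  age    = runif(n, 18, 65),
  region = factor(sample(c("urban", "rural"), n, replace = TRUE))
)
# Hypothetical claim cost whose spread grows with age
dat$claim <- 500 + 20 * dat$age + rnorm(n, sd = 5 * dat$age)

fit_quantile <- function(alpha) {
  gbm(claim ~ age + region, data = dat,
      distribution = list(name = "quantile", alpha = alpha),
      n.trees = 500, interaction.depth = 2, shrinkage = 0.05,
      verbose = FALSE)
}

fit_lo <- fit_quantile(0.05)  # lower bound of a 90% prediction interval
fit_hi <- fit_quantile(0.95)  # upper bound

new_policy <- data.frame(age = 40,
                         region = factor("urban", levels = levels(dat$region)))
c(lower = predict(fit_lo, new_policy, n.trees = 500),
  upper = predict(fit_hi, new_policy, n.trees = 500))
```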

  • AI is the automation of decision-making and of tasks that normally demand human cognitive effort, like when a human brain’s neurons fire in a particular way to create an idea, make a selection, create a work of art, or write words on a page.
  • Supervised learning is the process of predicting an unknown label. It requires data that has already been labeled, which is what distinguishes it as a narrower field of study within AI.
  • Generative AI is when a computer synthetically generates information: pixels that create a picture, sine waves which we hear as music, the pattern of words and lyrics of poetry, or patterns which we call “art”.
  • Statistical learning is the field of study that uses summaries of data, or functions of random variables, to find patterns. The statistics are short numerical indicators of the data. Think of the movie “Moneyball”, where the baseball manager finds that home runs and ERA aren’t the best statistics for valuing players, and instead sparks a new era of baseball statistics driven by “Wins Above Replacement” and other measures.

```r
install.packages("ExamPAData")
```

ExamPAData: Collection of Datasets for Predictive Modeling

Downloaded by over 15,000 people. It provides a rich resource for anyone looking to improve their data analysis and predictive modeling skills.

Learn more about ExamPAData

Problem

Expensive study materials, a lack of expertise with new and emerging tools, and regulatory changes such as:

  • Digital Services Act
  • Digital Markets Act
  • DAC7
  • IFRS17
  • GAAP
  • GDPR

These regulations are lagging behind AI.

Existing Alternatives

Rely on shortcuts, coaching programs with vague promises, and impossible goals

Solutions

Affordable lessons with flashcards, a digital app, audio podcasts, and music, so you can self-study through gamified learning.

Unfair Advantage

A business license for predictive analytics in TX, advisors with MBAs who build brand awareness, and support from the AWS Activate program.

Early Adopters

Well, no one likes to be a guinea pig for untested technologies. That’s why my professional experience working in the AI industry with SaaS is valuable: I never provide code that hasn’t been thoroughly tested.

Predictive analytics is analogous to looking out the windshield. Descriptive analytics is looking in the rearview mirror.

My data science career began in Boston, Massachusetts, teaching for the actuarial exams, the same year that the Patriots mounted the largest comeback in Super Bowl history to beat the Atlanta Falcons.

What skills do you bring to the table?

Artificial Intelligence

AI won’t replace humans, but when we use AI we can replace manual tasks with click-and-run scripts.

Mathematics

The science of structure, order, and relation.

Computer Science

Technology and programming for computation and automation.

Business

The practice of making money by selling products.

Review of the New Chat GPT-4 Omni – First Impressions – May 2024

The new ChatGPT-4 Omni (GPT-4o) is more human-like and faster, but it is not AGI and has limitations in its abilities, particularly in legal recommendations; it is important to consult with a qualified attorney for compliance with state laws and unique circumstances. 🤖 GPT-4o feels more human-like and is faster, but it’s not AGI, just a specialized language model. OpenAI made a huge update to their chat, and although it was already quite effective, it now feels even more emotional, like I’m texting a friend. This clip shows me using a GPT-4o API called FreedomGPT, a project to make LLMs available to all. The OpenAI chat that I use offers model selection for 3.5, 4, and 4o, though you can max out the request limit, so I use 3.5 for basic tasks because it is efficient. Response times have been delayed. You can ask it more open-ended questions and still get good answers, although I would say it still suffers from formatting issues. It will give you more information than you need: in this example, when I asked it to fill in a few sentences, it returned an entire document. The challenge is still the same one that we humans face: how do you effectively select the information that’s important? Google has also released many AI products.

Conversational AI with Documents

Machine learning and regression models can accurately predict bike sharing demand, and AI can be trained to understand and use these models effectively. Here I chat with a document on AWS Bedrock to summarize a complex 18-page technical piece of writing; you can find this document in my earlier video about the project. 📝 The document examines data on bike sharing demand and explores modeling approaches to predict the number of bike rentals per hour, evaluating various regression models and identifying key factors influencing demand.
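The document’s actual models are not reproduced here, but as a hypothetical sketch of one common regression approach for hourly count data (with made-up column names and simulated values), a Poisson GLM might look like this:

```r
# Minimal sketch: predicting hourly bike rentals with a Poisson GLM (simulated data)
set.seed(7)
bikes <- data.frame(
  hour       = rep(0:23, times = 30),
  temp       = runif(720, 0, 35),
  workingday = rbinom(720, 1, 0.7)
)
bikes$rentals <- rpois(720, lambda = exp(2 + 0.05 * bikes$temp + 0.3 * bikes$workingday))

# Poisson regression is a natural choice for counts such as rentals per hour
fit <- glm(rentals ~ factor(hour) + temp + workingday,
           data = bikes, family = poisson())

# Predicted rentals for a warm working-day afternoon
predict(fit, data.frame(hour = 17, temp = 28, workingday = 1), type = "response")
```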

AI Foundation Models – June 2024

Is this reliable and secure generative AI? Amazon Bedrock is a new service for building with foundation models, offering easy scaling and customization for generative AI applications, with a focus on reliable and secure results through customization and guardrails.

  • Amazon Bedrock is a new service for building with foundation models, offering easy scaling and customization for generative AI applications.
  • AWS Bedrock allows customization and selection of state-of-the-art AI models for reliable and secure generative AI.
  • AI is used to generate a poem about traveling to Vermont, while discussing dataset creation and increasing response length.
  • Amazon’s Titan text model generates varied responses to short prompts, with the second poem being less rigid and following a specific structure.
  • Generative AI applications require consistent parameters for reliable and secure results.
  • The AI system uses guardrails to filter inappropriate content and prevent users from overriding them, by adding denied topics and avoiding misclassification.
  • Profanity and sensitive information, including alcohol-related words and personally identifiable information, can be filtered and blocked in AI model responses.
  • Foundation models in AI can be accessed through Amazon’s platform for transfer learning, allowing for exploration of different parameters and tunings.