**99 scoring-rules questions.**

For an OLS model the mean squared error can be used to assess the fit of the trained model on the validation data.
What is the equivalent for a logistic regression model? Can I simply use the ...

I want to use a custom objective function with xgboost: 1 - (log(y) - log(p)) / (log(y) - log(q)), where y = true value, p = my probabilities, q = some other base ...

I have built a quasi-Poisson regression to predict sales of different products based on a number of explanatory variables, with an offset term for the number of days each product was on sale.
To ...

I recently completed a Kaggle competition in which the ROC AUC score was used, as required by the competition. Before this project, I normally used the F1 score as the metric to measure model performance. ...

This is a general question that has been asked indirectly multiple times here, but it lacks a single authoritative answer. It would be great to have a detailed answer to this for reference.
...

I tried to solve the problem of comparing two estimators for soccer matches. The "estimators" are actually two punters trying to predict game results. The predicted value is between 0 and 1. The ...

I have recently been learning about proper scoring rules for probabilistic classifiers. Several threads on this website have made a point of emphasizing that accuracy is an improper scoring rule and ...

Several sources state that the score function for the likelihood of a cox model is
$$
\dfrac{\partial{}l(\beta)}{\partial\beta}=\Big(X_{i}\delta_i^T-\sum\limits_{i=1}^{n}\delta_i\dfrac{\sum\limits_{j\...

A proper scoring rule is a rule that is maximized by a 'true' model and does not allow 'hedging' or gaming the system (deliberately reporting results that differ from the true belief of the model to ...
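
The idea in the excerpt above can be checked numerically. The following is a minimal sketch, not taken from the question itself: it assumes a single Bernoulli outcome with a hypothetical true probability `p_true = 0.7` and uses loss orientation (lower is better) rather than the "maximized" orientation in the excerpt. The function names and the grid are illustrative assumptions. Under the Brier and logarithmic losses the best report is the true probability, while under absolute error (an improper rule) the forecaster does better by pushing the report toward the extreme.

```python
import numpy as np

def expected_brier(report, p):
    # Expected squared-error loss when the outcome is Bernoulli(p)
    return p * (1 - report) ** 2 + (1 - p) * report ** 2

def expected_log_loss(report, p):
    # Expected negative log-likelihood when the outcome is Bernoulli(p)
    return -(p * np.log(report) + (1 - p) * np.log(1 - report))

def expected_abs_error(report, p):
    # Expected absolute error; linear in the report, hence improper
    return p * (1 - report) + (1 - p) * report

def best_report(expected_loss, p, grid):
    # Report on the grid that minimizes the expected loss
    return grid[np.argmin(expected_loss(grid, p))]

grid = np.linspace(0.01, 0.99, 99)  # candidate reports 0.01, 0.02, ..., 0.99
p_true = 0.7                        # hypothetical true probability

brier_best = best_report(expected_brier, p_true, grid)
log_best = best_report(expected_log_loss, p_true, grid)
abs_best = best_report(expected_abs_error, p_true, grid)

print("Brier:", brier_best)
print("Log:", log_best)
print("Absolute error:", abs_best)
```

The grid search is only for illustration; the same conclusion follows from setting the derivative of each expected loss with respect to the report to zero.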

When we search for a numerical way to find $\hat{\beta}$ in a GLM (say, a logistic regression), we could do a numerical optimization (minimization) of the negative log-likelihood.
But instead, we go ...

I need help with figuring out a proper scoring rule for the following task.
There are 11 possible outcomes.
There is a true probability distribution over these outcomes (known to me but not you). ...

I have seen different papers use different terms to express the scoring rules that they used to compare Bayesian models. Some of those terms are,
Log Predictive Density (Bayesian Data Analysis - by ...

The situation:
I have a logistic model that should predict a defect (1=defect, 0=no defect). My model uses 4 out of 14 parameters, which are significant for my ...

I would like to evaluate the performance of my machine learning model on a test set, but I only have access to the ground truth for a subset of the test set, say 60% of it. On the remaining 40%, the ...

I am building a machine learning model to attempt to predict the winner of a sports match based on historical statistics of the two teams.
My model (a neural network) appears to get about 70% ...

[I'm not a mathematician, so please forgive any misuse of terminology]
One way of understanding scoring rules is that they measure the 'distance' between the truth value of a statement, and the ...

My problem concerns an aptitude test containing a set of single choice items ($x_1, x_2, \dots, x_n$). For each item, the participant may have selected option 1 to 5. These choices are scored dichotomously (...

I'm using the scikit package with RandomForestClassifier, trying to predict binary or multi-label classifications.
I'm looking for a way to estimate the reliability of the model but really can't figure ...

I am trying to prepare a questionnaire of 30 items in order to assess the quality of published papers. Based on the importance of each criterion, I want to assign weights. Should I assign the weights ...

I need to create a scorecard where I can compare and contrast 3 different managers on a single measure/metric. The problem I'm having is coming up with a defensible and fair scorecard logic/...

Background: There are some great questions/answers here on how to calibrate models which predict probabilities of an outcome happening. For example
Brier score, and its decomposition into resolution, ...

Dr Frank Harrell mentioned in his book and BIOS 330 course that
Accuracy score used to drive model building should be a continuous score that utilizes all the information in the data (e.g. Brier ...

I am running an analysis on the probability of loan default using logistic regression and random forests.
When I use logistic regression, the prediction is always all '1' (which means good loan). ...

I wonder what is the best statistical method to validate/score/evaluate a regression Neural Network used to predict probabilities (an example would be using a Regression NN to predict the probability ...

I've been studying (and applying) SVMs for some time now, mostly through kernlab in R.
...

Merkle & Steyvers (2013) write:
To formally define a proper scoring rule, let $f$ be a probabilistic
forecast of a Bernoulli trial $d$ with true success probability $p$.
Proper scoring ...

Suppose a populous nation has a high homicide rate and an understaffed police force. The police chief hires a statistician and together they decide to take a preventative approach by identifying ...

I'm testing several models that predict a binary variable. I've applied the logarithmic, Brier, and spherical scoring rules. Now, how do I determine what is a meaningful difference between the scores ...
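
For reference, the three rules named in the excerpt above can be sketched for binary outcomes as follows. This is a rough illustration under stated assumptions: the function names, the toy outcomes `y`, and the forecasts `model_a` and `model_b` are hypothetical and not from the question. The Brier and logarithmic rules are written as losses (lower is better), while the spherical rule is positively oriented (higher is better). Computing per-observation scores keeps the paired differences between two models available for whatever comparison is chosen afterwards.

```python
import numpy as np

def brier_score(p, y):
    # Squared difference between forecast probability and 0/1 outcome
    p, y = np.asarray(p, dtype=float), np.asarray(y, dtype=float)
    return (p - y) ** 2

def log_score(p, y):
    # Negative log of the probability assigned to the observed outcome
    p, y = np.asarray(p, dtype=float), np.asarray(y, dtype=float)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

def spherical_score(p, y):
    # Probability of the observed outcome divided by the L2 norm of (p, 1 - p)
    p, y = np.asarray(p, dtype=float), np.asarray(y, dtype=float)
    p_observed = np.where(y == 1, p, 1 - p)
    return p_observed / np.sqrt(p ** 2 + (1 - p) ** 2)

# Hypothetical outcomes and forecasts from two models (illustrative data only)
y = np.array([1, 0, 1, 1, 0])
model_a = np.array([0.8, 0.3, 0.6, 0.9, 0.2])
model_b = np.array([0.6, 0.4, 0.5, 0.7, 0.5])

for name, rule in [("brier", brier_score), ("log", log_score), ("spherical", spherical_score)]:
    paired_diff = rule(model_a, y) - rule(model_b, y)  # per-observation difference
    print(name, "mean difference (A - B):", paired_diff.mean())
```

The sketch only computes the scores and their paired differences; it does not by itself answer what size of difference is meaningful.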

Next week I will teach my students the score function and its variance (i.e., the Fisher information).
I am looking for way(s) to illustrate these concepts so as to help my students understand them (and not ...

I read about using scoring rules to evaluate the performance of predictive models. In the Wikipedia article about the Brier score, it is stated:
The Brier score is appropriate for binary and ...

The Spherical scoring rule is known to be strictly proper.
However, it is not very intuitive.
Its arccos, however, is the angle between the prediction ...

Suppose we have a bivariate variable $z_t=(x_t, y_t)$ indexed by $t=1,2, ..., T$. Suppose now that the two components have different support, i.e. in my specific problem $x_t \in \mathcal{S}$, where $\...

A very popular index to measure the stability of characteristics of a scorecard is defined by the following formula:
$$SI = \frac{1}{n} \sum \text{(actual in %)-(expected in %)} \cdot \log\left(\frac{...

I need to learn a Bayesian network structure from a dataset. I read the book titled "Learning Bayesian Networks" by Richard Neapolitan, but I have no clear idea.
According to the book from ...

Which are the widely used scoring functions for structural learning? More specifically, I am interested in scoring functions that favour random variables with binary states.
For an ...

I am familiar with the Bayesian score, which is used to compare competing structures of a BN. However, I have difficulty in understanding how the Bayesian score formula below is derived.
1) What is ...

We work in the following cycle:
Every month we create a new predictive model, based on data known about half a year before. Then we make a prediction on the newest data available, and that scoring goes to ...

In the LGD model flow presented in Figure 4.13 of the book "Developing Credit Risk Models Using SAS Enterprise Miner and SAS/STAT: Theory and Application", which is partially available on the web:
...

My situation is the following:
I have a matrix consisting of purchase probabilities for different products per user.
Retrospectively, I have another matrix consisting of actual purchases. Now my task is to ...

My problem is the following: I have purchase probability estimates for different products. The underlying model does not account for any correlations between these products. So my task is to re-...

I am using logistic regression to predict likelihood of an event occurring. Ultimately, these probabilities are put into a production environment, where we focus as much as possible on hitting our "...
