Metaculus¶

class Metaculus(api_domain='www', username=None, password=None)[source]¶

The main class for interacting with Metaculus

Parameters

api_domain (Optional[str]) – A Metaculus subdomain (e.g., www, pandemic, finance)
username (Optional[str]) – A Metaculus username (deprecated)
password (Optional[str]) – The password for the given Metaculus username (deprecated)

get_question(id, name=None)[source]¶

Load a question from Metaculus

Parameters

id (int) – Question id (can be read off from URL)
name – Name to assign to this question (used in models)

Return type

MetaculusQuestion

get_questions(question_status='all', player_status='any', cat=None, pages=1, fail_silent=False, load_detail=True)[source]¶

Retrieve multiple questions from Metaculus API.

Parameters

question_status (Literal['all', 'upcoming', 'open', 'closed', 'resolved', 'discussion']) – Question status
player_status (Literal['any', 'predicted', 'not-predicted', 'author', 'interested', 'private']) – Player’s status on this question
cat (Optional[str]) – Category slug
pages (int) – Number of pages of questions to retrieve

Return type

List[MetaculusQuestion]

MetaculusQuestion¶

class MetaculusQuestion(id, metaculus, data, name=None)[source]¶

A forecasting question on Metaculus

Parameters

id (int) – Question id
metaculus (Any) – Metaculus API instance
data (Dict) – Question JSON retrieved from Metaculus API
name – Name to assign to question (used in models)

Variables

activity –
anon_prediction_count –
author –
author_name –
can_use_powers –
close_time – when the question closes
comment_count –
created_time – when the question was created
id – question id
is_continuous – is the question continuous or binary?
last_activity_time –
page_url – url for the question page on Metaculus
possibilities –
prediction_histogram – histogram of the current community prediction
prediction_timeseries – predictions on this question over time
publish_time – when the question was published
resolution –
resolve_time – when the question will resolve
status –
title –
type –
url –
votes –

static get_central_quantiles(df, percent_kept=0.95, side_cut_from='both')[source]¶

Get the values that bound the central percent_kept fraction of the sample distribution, i.e., cutting the tails at these values leaves the central part of the distribution. If passed a dataframe with multiple variables, the bounds that encompass all variables will be returned.

Parameters

df (Union[DataFrame, Series, DeviceArray, ndarray]) – pandas dataframe of one or more columns of samples
percent_kept (float) – fraction of the sample distribution to keep (e.g., 0.95 for 95%)
side_cut_from (str) – which side to cut tails from: 'both', 'lower', or 'upper'

Returns

lower and upper bounds of the central percent_kept fraction of the sample distribution.
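The tail-cutting described above can be sketched in plain Python. This is only an illustration, not ergo’s implementation: the helper names quantile and central_bounds are hypothetical, columns are passed as plain lists instead of a dataframe, and the way the cut fraction is split between the tails is an assumption.

```python
from typing import List, Sequence, Tuple


def quantile(sorted_xs: Sequence[float], q: float) -> float:
    """Linearly interpolated quantile of already-sorted data, 0 <= q <= 1."""
    pos = q * (len(sorted_xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_xs) - 1)
    frac = pos - lo
    return sorted_xs[lo] * (1 - frac) + sorted_xs[hi] * frac


def central_bounds(
    columns: List[Sequence[float]],
    percent_kept: float = 0.95,
    side_cut_from: str = "both",
) -> Tuple[float, float]:
    """Bounds of the central `percent_kept` mass, widened to cover every column."""
    if side_cut_from not in ("both", "lower", "upper"):
        raise ValueError("side_cut_from must be 'both', 'lower', or 'upper'")
    cut = 1.0 - percent_kept
    if side_cut_from == "both":
        # Assumption: the cut mass is split evenly between the two tails.
        q_lo, q_hi = cut / 2, 1.0 - cut / 2
    elif side_cut_from == "lower":
        q_lo, q_hi = cut, 1.0
    else:
        q_lo, q_hi = 0.0, 1.0 - cut
    lowers, uppers = [], []
    for col in columns:
        xs = sorted(col)
        lowers.append(quantile(xs, q_lo))
        uppers.append(quantile(xs, q_hi))
    # Take the widest bounds so that every column's central mass is covered.
    return min(lowers), max(uppers)


model_a = list(range(101))      # samples 0..100
model_b = list(range(50, 151))  # samples 50..150
lower, upper = central_bounds([model_a, model_b], percent_kept=0.9)
```

With two columns, the lower bound comes from the column with the smallest lower quantile and the upper bound from the column with the largest upper quantile, which matches the "bounds that encompass all variables" behaviour described above.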

refresh_question()[source]¶

Refetch the question data from Metaculus. Use this when the question data might have changed.

sample_community()[source]¶

Get one sample from the distribution of the Metaculus community’s prediction on this question. The sample is denormalized, i.e., on the true scale of the question.

set_data(key, value)[source]¶

Set key on data dict

Parameters

key (str) –
value (Any) –

static to_dataframe(questions, columns=['id', 'title', 'resolve_time'])[source]¶

Summarize a list of questions in a dataframe

Parameters

questions (List[MetaculusQuestion]) – questions to summarize
columns (List[str]) – list of column names as strings

Return type

DataFrame

Returns

pandas dataframe summarizing the questions

ContinuousQuestion¶

class ContinuousQuestion(id, metaculus, data, name=None)[source]¶

A continuous Metaculus question – a question of the form, what’s your distribution on this event?

change_since(since)[source]¶

Calculate the change in the community prediction median between the given time and the most recent prediction

Parameters: since (datetime) – time to measure the change from
Returns: change in median community prediction since the given time
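As a rough sketch of this calculation, the example below works on a plain list of (time, median) pairs. The function name, the input structure, and the choice of the latest entry at or before the given time as the baseline are all assumptions made for illustration; the layout of prediction_timeseries is not documented here.

```python
from bisect import bisect_right
from datetime import datetime
from typing import List, Tuple


def change_in_median(timeseries: List[Tuple[datetime, float]], since: datetime) -> float:
    """Latest median minus the median in effect at `since`.

    `timeseries` is a hypothetical list of (time, community median) pairs,
    sorted by time.
    """
    times = [t for t, _ in timeseries]
    # Index of the last entry recorded at or before `since`.
    idx = bisect_right(times, since) - 1
    if idx < 0:
        raise ValueError("no prediction at or before `since`")
    return timeseries[-1][1] - timeseries[idx][1]


history = [
    (datetime(2020, 1, 1), 0.40),
    (datetime(2020, 1, 5), 0.45),
    (datetime(2020, 1, 9), 0.55),
]
delta = change_in_median(history, datetime(2020, 1, 6))  # compares 0.55 against 0.45
```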

community_dist()[source]¶

Get the community distribution for this question. NB: currently missing the part of the distribution outside the question range.

Return type: PointDensity
Returns: the (true-scale) community distribution as a histogram.

community_dist_in_range()[source]¶

A distribution for the portion of the current normalized community prediction that’s within the question’s range, i.e. 0…(len(self.prediction_histogram)-1).

Returns: distribution on integers

denormalize_samples(samples)[source]¶

Map samples from the Metaculus normalized scale to the true scale

Parameters: samples – samples on the normalized scale
Returns: samples from a distribution answering the prediction question (true scale)

property has_predictions¶: Are there any predictions for the question yet?

property high_open¶

Are you allowed to place probability mass above the top of this question’s range?

Return type: bool

property latest_community_percentiles¶

Returns: Some percentiles for the Metaculus community’s latest rough prediction. prediction_histogram returns a more fine-grained histogram of the community prediction.

property low_open¶

Are you allowed to place probability mass below the bottom of this question’s range?

Return type: bool

normalize_samples(samples)[source]¶

Map samples from their true scale to the Metaculus normalized scale

Parameters: samples – samples from a distribution answering the prediction question (true scale)
Returns: samples on the normalized scale
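For a question on a linear scale, the mapping between the true scale and the normalized scale can be sketched as below. This assumes the normalized scale places the bottom of the question range at 0 and the top at 1. The function names and the low/high arguments are hypothetical stand-ins for the question range, and log-scale questions would need a different transform.

```python
from typing import List, Sequence


def normalize_linear(samples: Sequence[float], low: float, high: float) -> List[float]:
    """Map true-scale samples so that `low` maps to 0 and `high` maps to 1."""
    width = high - low
    return [(x - low) / width for x in samples]


def denormalize_linear(samples: Sequence[float], low: float, high: float) -> List[float]:
    """Inverse of normalize_linear: map normalized samples back to the true scale."""
    width = high - low
    return [low + x * width for x in samples]


true_samples = [10.0, 15.0, 20.0, 25.0]
normalized = normalize_linear(true_samples, low=10.0, high=20.0)
restored = denormalize_linear(normalized, low=10.0, high=20.0)
```

Samples outside the question range map to values below 0 or above 1, which is where low_open and high_open become relevant.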

property p_outside¶

How much probability mass is outside this question’s range?

Return type: Optional[float]

prepare_logistic(normalized_dist)[source]¶

Transform a single logistic distribution by clipping the parameters and adding scale information as needed for submission to Metaculus. The loc and scale have to be within a certain range for the Metaculus API to accept the prediction.

Parameters: normalized_dist – a (normalized) logistic distribution
Return type: Logistic
Returns: a transformed logistic distribution

prepare_logistic_mixture(normalized_dist)[source]¶

Transform a (normalized) logistic mixture distribution as needed for submission to Metaculus.

Parameters: normalized_dist (LogisticMixture) – normalized mixture dist
Return type: LogisticMixture
Returns: normalized dist clipped and formatted for the API

property question_range¶: Range of answers specified when the question was created

sample_community()[source]¶

Sample an approximation of the entire current community prediction, on the true scale of the question. The main reason that it’s just an approximation is that we don’t know exactly where probability mass outside of the question range should be, so we place it arbitrarily.

Return type: float
Returns: One sample on the true scale

sample_normalized_community()[source]¶

Sample an approximation of the entire current community prediction, on the normalized scale. The main reason that it’s just an approximation is that we don’t know exactly where probability mass outside of the question range should be, so we place it arbitrarily.

Return type: float
Returns: One sample on the normalized scale

show_community_prediction(percent_kept=0.95, side_cut_from='both', num_samples=1000, **kwargs)[source]¶

Plot samples from the community prediction on this question

Parameters

percent_kept (float) – fraction of the sample distribution to keep (e.g., 0.95 for 95%)
side_cut_from (str) – which side to cut tails from: 'both', 'lower', or 'upper'
num_samples (int) – number of samples from the community
kwargs – additional plotting parameters

show_prediction(samples, plot_samples=True, plot_fitted=False, percent_kept=0.95, side_cut_from='both', show_community=False, num_samples=1000, **kwargs)[source]¶

Plot a prediction on the true question scale from samples or a submission object. Optionally compare the prediction against a sample from the distribution of community predictions.

Parameters

samples – samples from a distribution answering the prediction question (true scale). Can either be a 1-d array corresponding to one model’s predictions, or a pandas DataFrame with each column corresponding to a distinct model’s predictions
plot_samples (bool) – boolean indicating whether to plot the raw samples
plot_fitted (bool) – boolean indicating whether to compute Logistic Mixture Params from samples and plot the resulting fitted distribution. Note this is currently only supported for 1-d samples
percent_kept (float) – fraction of the sample distribution to keep (e.g., 0.95 for 95%)
side_cut_from (str) – which side to cut tails from: 'both', 'lower', or 'upper'
show_community (bool) – boolean indicating whether comparison to community predictions should be made
num_samples (int) – number of samples from the community
kwargs – additional plotting parameters

submit_from_samples(samples, verbose=False)[source]¶

Submit prediction to Metaculus based on samples from a prediction distribution

Parameters: samples – Samples from a distribution answering the prediction question
Return type: Response
Returns: logistic mixture params clipped and formatted to submit to Metaculus

LinearQuestion¶

class LinearQuestion(id, metaculus, data, name=None)[source]¶

A continuous Metaculus question that’s on a linear (as opposed to a log) scale

get_true_scale_logistic(normalized_dist)[source]¶

Convert a normalized logistic distribution to a logistic on the true scale of the question.

Parameters: normalized_dist (Logistic) – normalized logistic distribution
Return type: Logistic
Returns: logistic distribution on the true scale of the question
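Because a linear question’s normalized scale is an affine rescaling of its true scale, a logistic distribution can be carried across by transforming its location and scale parameters. The sketch below shows the arithmetic under that assumption. The Logistic dataclass, to_true_scale, and the low/high arguments are hypothetical illustrations, not ergo’s own classes.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Logistic:
    """Hypothetical stand-in for a logistic distribution (not ergo's class)."""
    loc: float
    scale: float


def cdf(dist: Logistic, x: float) -> float:
    """Cumulative distribution function of a logistic distribution."""
    return 1.0 / (1.0 + math.exp(-(x - dist.loc) / dist.scale))


def to_true_scale(normalized: Logistic, low: float, high: float) -> Logistic:
    """Rescale a logistic on [0, 1] to a question range [low, high].

    If X is logistic(loc, scale), then low + width * X is
    logistic(low + width * loc, width * scale).
    """
    width = high - low
    return Logistic(loc=low + width * normalized.loc, scale=width * normalized.scale)


normalized_dist = Logistic(loc=0.5, scale=0.1)
true_dist = to_true_scale(normalized_dist, low=0.0, high=200.0)
```

A quick consistency check is that the normalized CDF at x equals the true-scale CDF at low + x * (high - low).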

get_true_scale_mixture(normalized_dist)[source]¶

Convert a normalized logistic mixture distribution to a logistic mixture on the true scale of the question.

Parameters: normalized_dist (LogisticMixture) – normalized logistic mixture dist
Return type: LogisticMixture
Returns: same distribution rescaled to the true scale of the question

LogQuestion¶

class LogQuestion(id, metaculus, data, name=None)[source]¶

LinearDateQuestion¶

class LinearDateQuestion(id, metaculus, data, name=None)[source]¶

date_to_timestamp(date)[source]¶

Turn a date string in %Y-%m-%d format into a timestamp. Metaculus uses this format for dates when specifying the range of a date question. We’re assuming Metaculus is interpreting these date strings as UTC.

Returns: A Unix timestamp
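A minimal sketch of this conversion using only the standard library is shown below. It follows the description above (parse %Y-%m-%d, interpret as midnight UTC) and is not necessarily how ergo implements it.

```python
from datetime import datetime, timezone


def date_to_timestamp(date: str) -> float:
    """Convert a %Y-%m-%d date string to a Unix timestamp, read as midnight UTC."""
    parsed = datetime.strptime(date, "%Y-%m-%d")
    # Attach UTC explicitly; a naive datetime would be read in the local timezone.
    return parsed.replace(tzinfo=timezone.utc).timestamp()


one_day = date_to_timestamp("1970-01-02")  # 86400.0
```

Attaching the timezone explicitly matters: calling timestamp() on a naive datetime uses the machine’s local timezone, so results would differ between machines.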

sample_community()[source]¶

Sample an approximation of the entire current community prediction, on the true scale of the question.

Returns: One sample on the true scale

BinaryQuestion¶

class BinaryQuestion(id, metaculus, data, name=None)[source]¶

A binary Metaculus question – how likely is this event to happen, from 0 to 1?

change_since(since)[source]¶

Calculate the change in the community prediction between the given time and the most recent prediction

Parameters: since (datetime) – time to measure the change from
Returns: change in community prediction since the given time

sample_community()[source]¶

Sample from the Metaculus community distribution (Bernoulli).

Return type: bool
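Sampling from a Bernoulli distribution amounts to returning True with probability p. The sketch below illustrates this with the standard library. The function name and the explicit random generator argument are assumptions for the example, with p standing in for the community’s probability.

```python
import random


def sample_bernoulli(p: float, rng: random.Random) -> bool:
    """Return True with probability p, where 0 <= p <= 1."""
    # rng.random() is uniform on [0, 1), so this is True with probability p.
    return rng.random() < p


rng = random.Random(0)  # seeded so repeated runs give the same draws
draws = [sample_bernoulli(0.7, rng) for _ in range(10_000)]
share_true = sum(draws) / len(draws)  # close to 0.7 for a large number of draws
```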

score_my_predictions()[source]¶

Score all of my predictions according to the question resolution (or according to the current community prediction if the resolution isn’t available)

Returns: List of ScoredPredictions with Brier scores

score_prediction(prediction, resolution)[source]¶

Score a prediction relative to a resolution using a Brier Score.

Parameters

prediction – how likely is the event to happen, from 0 to 1?
resolution (float) – the outcome to score against, from 0 to 1 (0 if the event didn’t happen, 1 if it did)

Return type

ScoredPrediction

Returns

ScoredPrediction with Brier score (see https://en.wikipedia.org/wiki/Brier_score#Definition). 0 is best, 1 is worst, and 0.25 is chance.
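The score ranges quoted above are consistent with the single-outcome form of the Brier score, the squared difference between the prediction and the resolution. The sketch below shows that form. It is an illustration that returns a bare float, not the ScoredPrediction object, and the function name is hypothetical.

```python
def brier_score(prediction: float, resolution: float) -> float:
    """Squared error between a probability and the outcome it is scored against."""
    return (prediction - resolution) ** 2


perfect = brier_score(1.0, 1.0)  # 0.0, the best possible score
worst = brier_score(0.0, 1.0)    # 1.0, the worst possible score
chance = brier_score(0.5, 1.0)   # 0.25, the score of a 50% prediction
```

A 50% prediction scores 0.25 whichever way the question resolves, which is why 0.25 is the chance baseline.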

submit(p)[source]¶

Submit a prediction to my Metaculus account

Parameters: p (float) – how likely is the event to happen, from 0 to 1?
Return type: Response