{"id":21222094,"url":"https://github.com/adamelkholyy/ma-probcog","last_synced_at":"2025-08-28T02:36:01.430Z","repository":{"id":187215334,"uuid":"676508561","full_name":"adamelkholyy/ma-probcog","owner":"adamelkholyy","description":"Notebook from masters course in Probabilistic Cognitive Modelling @ University of Helsinki. Includes manual calculation of response distribution and Bayesian observer likelihoods.","archived":false,"fork":false,"pushed_at":"2024-10-21T11:46:15.000Z","size":859,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-11-20T22:57:49.039Z","etag":null,"topics":["bayesian-inference","bayesian-machine-learning","probabilistic-models"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/adamelkholyy.png","metadata":{"files":{"readme":"README.MD","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-08-09T11:07:38.000Z","updated_at":"2024-10-21T11:46:19.000Z","dependencies_parsed_at":null,"dependency_job_id":"47b722f2-c7f3-4ebf-8955-3e4cb36954a6","html_url":"https://github.com/adamelkholyy/ma-probcog","commit_stats":null,"previous_names":["adamelkholyy/probabilistic-cognitive-modelling","adamelkholyy/ma-probcog"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/adamelkholyy%2Fma-probcog","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/adamelkholyy%2Fma-probcog/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/adamelkholyy%2Fma-probcog/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/adamelkholyy%2Fma-probcog/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/adamelkholyy","download_url":"https://codeload.github.com/adamelkholyy/ma-probcog/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":234988228,"owners_count":18918097,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bayesian-inference","bayesian-machine-learning","probabilistic-models"],"created_at":"2024-11-20T22:39:40.414Z","updated_at":"2025-01-21T17:13:48.722Z","avatar_url":"https://github.com/adamelkholyy.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"University of Helsinki, Master's Programme in Data Science  \nDATA20047 Probabilistic Cognitive Modelling - Spring 2023  \nLuigi Acerbi  \nAdam El Kholy\n\n## Calculating Response Distribution and Model Fitting\n## References\n\n- \\[**MKG22**\\] Ma WJ, Körding K, and Goldreich D. \"Bayesian Models of Perception and Action: An Introduction\". MIT Press, 2022.\n- \\[**AWV12**\\] Acerbi L, Wolpert DM, Vijayakumar S. \"Internal Representations of Temporal Statistics and Feedback Calibrate Motor-Sensory Interval Timing\". *PLoS Computational Biology*, 2012. [Link](https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1002771)\n\n\n```python\n# set-up -- do not change\nimport numpy as np\nimport numpy.random as npr\nimport scipy as sp\nimport scipy.stats as sps\nimport pandas as pd\nimport matplotlib.pyplot as plt\nnpr.seed(1)\n```\n\n# Question 2.1 (7 pts)\n\n\u003e This question is about computing the total (root mean squared) error (RMSE) for a Bayesian observer, as seen in Week 3 of the course. The take-home message here is that the Bayesian observer whose prior matches the true empirical distribution of stimuli will perform best at the task (lower RMSE), compared to a Bayesian observer with an incorrect (aka *mismatched*) belief about the distribution of stimuli (i.e., whose prior does not match the true stimulus distribution). See Chapter 4.5 of \\[**MKG22**\\] and the lecture notes for Week 3.\n\nA Bayesian observer is estimating a stimulus with empirical distribution $p(s) = \\text{Uniform}(s; -5, 5)$.\nThe measurement distribution and likelihood are Gaussian $p(x|s) = \\mathcal{N}\\left(x; s, \\sigma^2 \\right)$ with $\\sigma = 2$.\nWe assume that the observer uses the posterior mean estimator $\\hat{s}_{PM}$ and we ignore response noise. However, we consider the observer uses as prior a distribution $q(s)$ which might differ from the true prior (mismatched prior).\n\n- a) Compute the total RMSE assuming $q(s) = p(s)$, i.e. the observer uses the true stimulus distribution as prior.\n- b) Compute the total RMSE assuming the observer uses an approximate Gaussian prior, $q(s) = \\mathcal{N}\\left(s; \\mu_s, \\sigma_s^2 \\right)$ with mean $\\mu_s$ and variance $\\sigma^2_s$ equal to the mean and variance of the true stimulus distribution. *Hint*: You can find the variance of a continuous uniform distribution [here](https://en.wikipedia.org/wiki/Continuous_uniform_distribution).\n- c) Compute the total RMSE assuming the observer uses as prior a mismatched, wider Uniform distribution, $q(s) = \\text{Uniform}\\left(s; -8, 8 \\right)$.\n\nReport your results in Moodle. The accepted tolerance is $\\pm 0.01$ from the true value.\n\n*Hints*: \n- Remember that the (total) RMSE of an estimator $\\hat{s}$ is computed as\n$$\n\\text{RMSE}[\\hat{s}] = \\sqrt{\\int \\text{MSE}\\left[\\hat{s}|s\\right] p(s) ds}\n$$\n  where $p(s)$ is the true empirical distribution and $\\text{MSE}\\left[\\hat{s}|s\\right]$ is the mean squared error at each stimulus, defined as\n$$\n\\text{MSE}\\left[\\hat{s}|s\\right] = \\mathbb{E}_{\\hat{s}|s}\\left[\\left(\\hat{s}-s\\right)^2|s \\right] = \\text{Bias}\\left[\\hat{s}|s\\right]^2 + \\text{Var}\\left[\\hat{s}|s\\right],\n$$\n  where the definitions for bias and variance can be found in the textbook or lecture notes.\n- Note that changing the prior $q(s)$ will change $\\hat{s}(x)$, but nothing else! So once you manage to compute (a), you should be able to compute (b) and (c) with a small change to the code (only where $\\hat{s}(x)$ is computed).\n- You may want to check out Exercise 3.3 of the workshops.\n\n\n```python\ndef compute_posterior_mean_1d(s_grid, prior_pdf, likelihood):\n    \"\"\"Compute s_hat_PM (posterior mean) for an arbitrary prior and likelihood in 1d.\"\"\"\n    ds = s_grid.flatten()[1] - s_grid.flatten()[0] # grid spacing\n    protoposterior = prior_pdf * likelihood    \n    normalization_constant = sp.integrate.romb(protoposterior, dx=ds, axis=0)\n    posterior_pdf = protoposterior / normalization_constant\n    posterior_mean = sp.integrate.romb(s_grid * posterior_pdf, dx=ds, axis=0)\n    return posterior_mean\n\ndef compute_and_plot_metrics(r_col, s_hat, stimulus_pdf, label):\n    \"\"\"Compute bias, variance, conditional and overall MSE.\"\"\"\n    bias = sp.integrate.romb(s_hat * sps.norm.pdf(x_row,r_col,sigma),dx=dx,axis=1) - r_col.flatten()\n    std = np.sqrt(sp.integrate.romb(s_hat**2 * sps.norm.pdf(x_row,r_col,sigma),dx=dx,axis=1) \n        - sp.integrate.romb(s_hat * sps.norm.pdf(x_row,r_col,sigma),dx=dx,axis=1)**2)\n    rmse = np.sqrt(bias**2 + std**2)\n    \n    # Plot only where the support of the Uniform stimulus distribution is nonzero\n    s_range = r_col.copy()\n    s_range[np.logical_or(s_range \u003c a, s_range \u003e b)] = np.nan\n\n    plt.plot(s_range.flatten(),bias.flatten(),label='bias')\n    plt.plot(s_range.flatten(),std.flatten(),label='standard deviation')\n    plt.plot(s_range.flatten(),rmse.flatten(),label='RMSE')\n    plt.plot((a,b),(0,0),':k')\n    plt.xlabel(r'$s$')\n    plt.ylabel(r'Metrics')\n    plt.title('Bias, standard deviation and RMSE (' + label + ')')\n    plt.legend()\n    plt.show()\n\n    total_rmse = np.sqrt(sp.integrate.romb(rmse**2 * stimulus_pdf,dx=ds))\n    print('Total RMSE (' + label + '): {}'.format(total_rmse))\n```\n\n### 2.1\n\n\n```python\na = -5.\nb = 5.\nsigma = 2.\n\nprint('a)')\nNx = 2**9+1\nNs = 2**9+1\n\n# define the grid\nlb = a - sigma*5.\nub = b + sigma*5.\nx_row = np.linspace(lb, ub, Nx).reshape((1,Nx)) # make x a row vector\nr_col = np.linspace(lb, ub, Ns).reshape((Ns,1)) # make s a column vector\ndx = x_row.flatten()[1] - x_row.flatten()[0]\nds = r_col.flatten()[1] - r_col.flatten()[0]\n\nprior_pdf = sps.uniform.pdf(r_col, a, b-a)\n\nlikelihood = sps.norm.pdf(x_row, r_col, sigma) \ns_hat_row = compute_posterior_mean_1d(r_col, prior_pdf, likelihood).reshape((1,Nx)) # keep as a row vector\n\nstimulus_pdf = prior_pdf.copy().flatten()# As in most cases we consider, these two pdfs are the same\ncompute_and_plot_metrics(r_col, s_hat_row, stimulus_pdf, 'posterior mean estimate')    \n```\n\n    a)\n    \n\n\n![png](images/output_5_1.png)\n\n\n    Total RMSE (posterior mean estimate): 1.6048573201415854\n    \n\n\n```python\na = -5.\nb = 5.\nsigma = 2.\n\nprint('b)')\nNx = 2**9+1\nNs = 2**9+1\n\n# define the grid\nlb = a - sigma*5.\nub = b + sigma*5.\nx_row = np.linspace(lb, ub, Nx).reshape((1,Nx)) # make x a row vector\nr_col = np.linspace(lb, ub, Ns).reshape((Ns,1)) # make s a column vector\ndx = x_row.flatten()[1] - x_row.flatten()[0]\nds = r_col.flatten()[1] - r_col.flatten()[0]\n\nvariance = ((b - a)**2)/12\nmean = (b + a)/2\nsigma = np.sqrt(variance)\nprior_pdf = sps.norm.pdf(r_col, mean, sigma)\n\n#New sigma for likelihood\nsigma=2\nlikelihood = sps.norm.pdf(x_row, r_col, sigma) \ns_hat_row = compute_posterior_mean_1d(r_col, prior_pdf, likelihood).reshape((1,Nx)) # keep as a row vector\n\nstimulus_pdf = prior_pdf.copy().flatten()# As in most cases we consider, these two pdfs are the same\ncompute_and_plot_metrics(r_col, s_hat_row, stimulus_pdf, 'posterior mean estimate')    \n```\n\n    b)\n    \n\n\n![png](images/output_6_1.png)\n\n\n    Total RMSE (posterior mean estimate): 1.6446507755833726\n    \n\n\n```python\na = -8.\nb = 8.\nsigma =1.9\n\nprint('c)')\nNx = 2**9+1\nNs = 2**9+1\n\n\n# define the grid\nlb = a - sigma*8.\nub = b + sigma*8.\nx_row = np.linspace(lb, ub, Nx).reshape((1,Nx)) # make x a row vector\nr_col = np.linspace(lb, ub, Ns).reshape((Ns,1)) # make s a column vector\ndx = x_row.flatten()[1] - x_row.flatten()[0]\nds = r_col.flatten()[1] - r_col.flatten()[0]\n\n\nprior_pdf = sps.uniform.pdf(r_col, a, b-a)\n\nlikelihood = sps.norm.pdf(x_row, r_col, sigma) \ns_hat_row = compute_posterior_mean_1d(r_col, prior_pdf, likelihood).reshape((1,Nx)) # keep as a row vector\n\nstimulus_pdf = prior_pdf.copy().flatten()# As in most cases we consider, these two pdfs are the same\ncompute_and_plot_metrics(r_col, s_hat_row, stimulus_pdf, 'posterior mean estimate')    \n```\n\n    c)\n    \n\n\n![png](images/output_7_1.png)\n\n\n    Total RMSE (posterior mean estimate): 1.6842547306881315\n    \n\n# Question 2.2 (6 pts)\n\n\n\u003e In this question, we compute the response distribution for \\[**JS10**\\] under different assumptions about the Bayesian observer.\n\nConsider the time perception experiment from \\[**JS10**\\] which we analyzed in Exercise 3.4.\nWe recall the setup below. Note that there are differences from Exercise 3.4 (marked as **NEW**):\n- In this experiment, an observer is asked to judge the time interval $s$ between two flashes, measured in milliseconds (ms). In each trial, the duration is drawn from an interval distribution $p(s)$. \n- The experiment consist of three separate blocks of sessions run over multiple days. Each experimental block is identical except for the distribution of intervals $p(s)$. The distribution of time intervals in the three blocks are: \n  - $p_\\text{short}(s) = \\text{Uniform}\\left(s; 494, 847\\right)$\n  - $p_\\text{medium}(s) = \\text{Uniform}\\left(s; 671, 1023\\right)$\n  - $p_\\text{long}(s) = \\text{Uniform}\\left(s; 847,1200\\right)$\n- The observer's measurement distribution follows *Weber's law* (known in time perception as the \"scalar property\" of temporal judgment). According to this empirical law, the measurement noise is roughly linearly proportional to the magnitude of the stimulus. In formulas, $$p(x|s) = \\mathcal{N}\\left(x|s,\\sigma^2(s)\\right) \\qquad \\text{with} \\quad \\sigma(s) = w_s \\cdot s$$\n  where $w_s$ is known as *Weber's fraction*. Typical values of $w_s$ in timing are around 0.05-0.2, here we assume $w_s = 0.1$.\n- It is assumed that, after some practice, the observer develops a prior $p(s)$ which matches the stimulus distribution used in that block of sessions (and that the likelihood also matches the measurement distribution).\n- **NEW**: The observer responds with a deterministic estimate $\\hat{s}_\\text{MAP}$ which we assume is the mode of the posterior (also known as *maximum-a-posteriori* or MAP estimate). \n- **NEW**: The response is corrupted by motor noise which is proportional to the estimate:\n$$p(r|\\hat{s}) = \\mathcal{N}\\left(r; \\hat{s}, \\sigma_\\text{m}^2(\\hat{s})\\right) \\qquad \\text{with} \\quad \\sigma_\\text{m}(\\hat{s}) = w_\\text{m} \\cdot \\hat{s}$$ \n  where $w_\\text{m}$ represents the Weber's fraction for the motor noise. Here we assume $w_\\text{m} = 0.05$.\n  \n-------------------------------\n\nIn this exercise, we look at the *distribution of responses* $p(r|s)$ that the experimenter would observe for a given stimulus in the three different experimental blocks (short, medium, or long). We consider the stimulus $s^\\star = 847$ ms which appears in all three experimental blocks.\n\n- a) Compute $p(r|s = s^\\star)$ for the \"short\" block. Compute the mean and standard deviation of $p(r|s = s^\\star)$ and report them on Moodle.\n- b) Compute $p(r|s = s^\\star)$ for the \"medium\" block. Compute the mean and standard deviation of $p(r|s = s^\\star)$ and report them on Moodle.\n- c) Compute $p(r|s = s^\\star)$ for the \"long\" block. Compute the mean and standard deviation of $p(r|s = s^\\star)$ and report them on Moodle.\n\nThe accepted tolerance for the solutions is $\\pm 0.2$ ms for (a) and (b), and $\\pm 0.5$ ms for (c).\n\n*Hints*: \n- Be careful that the likelihood, $p(x|s)$ as a function of $s$, is *not* Gaussian, because $\\sigma(s)$ is not constant in $s$. As a consequence, the posterior will *not* be Gaussian. This affects the MAP estimate, $\\hat{s}_\\text{MAP}$, which you will need to compute numerically.\n- To compute the response distribution, remember the definition:\n$$\np(r|s) = \\int p(r|\\hat{s}(x)) p(x|s) dx,\n$$\n  which you can solve via numerical integration.\n- It is recommended that you first compute $\\hat{s}_\\text{MAP}(x)$ for a grid of $x$, and then compute the response distribution numerically via the integral above.\n- The MAP estimate $\\hat{s}_\\text{MAP}$ is the value of $s$ that maximizes the posterior $p(s|x)$. Note that this value does not depend on the normalization constant, so you can compute $p(s|x) \\propto p(s) p(x|s)$ for a (fine) grid of values `s_grid` and take the argument $s$ that maximizes this quantity.\n\n\n```python\ndef sanity_check(x_row,r_col,integrand, s_hat_row, response_integral_pdf):\n    if x_row[0][0] == 345.79999999999995:\n        idx = (np.abs(x_row[0] - 759)).argmin()\n        print(\"Sanity check: x=759, sMap=751.50\")\n        print(f\"x={x_row[0][idx]}, sMAP={s_hat_row[0][idx]}\")\n        \n        print('x_row.shape: {}'.format(x_row.shape))\n        print('r_col.shape: {}'.format(r_col.shape))\n        print('s_hat_row.shape: {}'.format(s_hat_row.shape))\n        print('integrand.shape: {}'.format(integrand.shape))\n        print('response_integral_pdf.shape: {}'.format(response_integral_pdf.shape))\n```\n\n\n```python\ndef plot_distribution(x, y, x_label, y_label, title):\n    plt.bar(x, y , color='red', width=0.5)\n    plt.xlabel(x_label)\n    plt.ylabel(y_label)\n    plt.title(title)\n    plt.show()\n```\n\n\n```python\na = np.array((494, 671, 847))\nb = np.array((847,1023,1200))\nlabel = ('short','medium','long')\nw_s = 0.1\nw_m = 0.05\n\n#res=11, clearance=5 for correct results\nres = 11\nNx = 2**res+1\nNs = 2**res+1\n\n# calculate sMAP\ndef compute_maximum_a_posteriori_1d(s_grid, prior_pdf, likelihood):\n    posterior = likelihood * prior_pdf\n    map_estimate = s_grid[np.argmax(posterior, axis=0)]\n    return np.transpose(map_estimate) #must be a row vector\n\n# calculate response distribution\ndef calculate_response_dist(x_row, r_col, likelihood, prior_pdf, _dx):\n    s_hat_row = compute_maximum_a_posteriori_1d(r_col, prior_pdf, likelihood)\n    measurement_dist = sps.norm.pdf(x_row, 847, w_s*847)\n    integrand = sps.norm.pdf(r_col, s_hat_row, w_m * s_hat_row) * measurement_dist \n    response_dist = sp.integrate.romb(integrand, dx=_dx, axis=1)\n    sanity_check(x_row,r_col,integrand, s_hat_row, response_dist)\n    return response_dist\n\nfor iter in range(a.size):\n    # define the grid    \n    clearance = 5\n    lb = a[iter] - (w_s*a[iter])*clearance\n    ub = b[iter] + (w_s*b[iter])*clearance\n    x_row = np.linspace(lb, ub, Nx).reshape((1,Nx)) # make x a row vector\n    r_col = np.linspace(lb, ub, Ns).reshape((Ns,1)) # make s a column vector\n    _dx = x_row.flatten()[1] - x_row.flatten()[0]\n\n    #calculate response distribution p(r|s=847)\n    prior_pdf = sps.uniform.pdf(r_col, loc=a[iter], scale=b[iter]-a[iter])    \n    likelihood = sps.norm.pdf(x_row, r_col, w_s*r_col)\n    response_dist = calculate_response_dist(x_row, r_col, likelihood, prior_pdf, _dx)\n\n    #mean and SD calculation\n    flat = r_col.flatten()\n    mean_r = sp.integrate.romb(flat * response_dist, dx=_dx)\n    var = sp.integrate.romb((flat - mean_r)**2 * response_dist, dx=_dx)\n    std = np.sqrt(var)\n    \n    #results\n    print(\"------------\"+ label[iter] + \"------------\")\n    print(f\"mean= {mean_r}\")\n    print(f\"std= {std}\")\n    print(\"1 ~=\", sp.integrate.romb(response_dist, dx=_dx)) #should be apprpx 1\n\nplot_distribution(r_col.flatten(), response_dist , \"r\", \"p(r|s=847)\", \n                  \"Long block response distribution\")\n```\n\n    ------------short------------\n    mean= 809.0869476150397\n    std= 65.67980760174596\n    1 ~= 0.9999997133477272\n    ------------medium------------\n    mean= 839.0043753778564\n    std= 91.38405383325505\n    1 ~= 0.9999999992244515\n    ------------long------------\n    mean= 876.7796740577605\n    std= 63.521941134265894\n    1 ~= 0.9999997133484282\n    \n\n\n![png](images/output_11_1.png)\n\n\n# Question 2.3 (6 pts)\n\n\u003e The key quantity for model fitting is the log-likelihood for a dataset and some model parameters. In this exercise, we compute the log-likelihood for a Bayesian observer model which also includes the possibility of *lapses*, a common mechanism used in cognitive science to explain away \"random\" responses and subjects' mistakes.\n\nIn this question, we consider the datasets from Experiment 3 of \\[**AWV12**\\], as seen in Week 4.. The experimental setup which involves time perception and interval reproduction is very similar to \\[**JS10**\\], so we can consider the same type of models.\n\nWe analyze the data with the `gaussianobserverwithlapse` model, defines as follows:\n\n- We assume the observer builds a (mismatched) Gaussian prior $p(s) = \\mathcal{N}\\left(s| \\mu_\\text{prior}, \\sigma_\\text{prior}^2 \\right)$ over the stimuli (time intervals). \n- We assume that the measurement distribution and likelihood are also Gaussian, $p(x|s) = \\mathcal{N}\\left(x| s, \\sigma^2 \\right)$.\n- The observer uses the *posterior mean* estimator for the value of the stimulus, $\\hat{s}_\\text{PM}$.\n- Gaussian motor response noise is added to the estimate, $p(r|\\hat{s}) = \\mathcal{N}\\left(r| \\hat{s}, \\sigma_\\text{motor}^2 \\right)$.\n- In each trial, the observer lapses with probability $\\lambda$ (the *lapse rate*), in which case the response is drawn from $p_\\text{lapse}(r) = \\text{Uniform}\\left(r; 0, 1500 \\right)$ ms. Otherwise, the observer responds normally (according to $p(r|\\hat{s})$ described above) with probability $1 - \\lambda$. \n- The parameters of this model are $\\mathbf{\\theta} = \\left(\\mu_\\text{prior}, \\sigma_\\text{prior}, \\sigma, \\sigma_\\text{motor}, \\lambda \\right)$.\n\nFor this question, we consider parameters $\\mathbf{\\theta}_\\star = \\left(\\mu_\\text{prior} = 780, \\sigma_\\text{prior} = 140, \\sigma = 90, \\sigma_\\text{motor} = 60, \\lambda = 0.02 \\right)$. \n\n- a) Compute the log-likelihood of model parameter $\\theta_\\star$ for the dataset of subject 2.\n- b) Compute the log-likelihood of model parameter $\\theta_\\star$ for the dataset of subject 5.\n\nReport your results on Moodle with high precision.\n\n*Hint*:\n- If you use code from the lectures, be careful about the model definition, as there may be subtle differences.\n\n\n```python\n# Load data of Experiment 3 of [AWV12] from .csv file to a Pandas dataframe\ndf = pd.read_csv('https://www2.helsinki.fi/sites/default/files/atoms/files/awv12_exp3.csv')\n\n# Remove unused columns (they deal with performance feedback, which we ignore in this lecture)\ndf.drop(df.columns[[6, 7, 8]], axis=1, inplace=True)\n\n# Remove rows with NaNs\ndf.dropna(axis=0, inplace=True)\n\ndf.head()\n```\n\n\n\n\n\u003cdiv\u003e\n\u003cstyle scoped\u003e\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n\u003c/style\u003e\n\u003ctable border=\"1\" class=\"dataframe\"\u003e\n  \u003cthead\u003e\n    \u003ctr style=\"text-align: right;\"\u003e\n      \u003cth\u003e\u003c/th\u003e\n      \u003cth\u003eSubject id\u003c/th\u003e\n      \u003cth\u003eSession id\u003c/th\u003e\n      \u003cth\u003eRun id\u003c/th\u003e\n      \u003cth\u003eStimulus (ms)\u003c/th\u003e\n      \u003cth\u003eResponse (ms)\u003c/th\u003e\n      \u003cth\u003eStimulus id\u003c/th\u003e\n    \u003c/tr\u003e\n  \u003c/thead\u003e\n  \u003ctbody\u003e\n    \u003ctr\u003e\n      \u003cth\u003e0\u003c/th\u003e\n      \u003ctd\u003e1\u003c/td\u003e\n      \u003ctd\u003e1\u003c/td\u003e\n      \u003ctd\u003e1\u003c/td\u003e\n      \u003ctd\u003e973.327049\u003c/td\u003e\n      \u003ctd\u003e862.947945\u003c/td\u003e\n      \u003ctd\u003e6.0\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003cth\u003e1\u003c/th\u003e\n      \u003ctd\u003e1\u003c/td\u003e\n      \u003ctd\u003e1\u003c/td\u003e\n      \u003ctd\u003e1\u003c/td\u003e\n      \u003ctd\u003e677.519900\u003c/td\u003e\n      \u003ctd\u003e574.920276\u003c/td\u003e\n      \u003ctd\u003e2.0\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003cth\u003e2\u003c/th\u003e\n      \u003ctd\u003e1\u003c/td\u003e\n      \u003ctd\u003e1\u003c/td\u003e\n      \u003ctd\u003e1\u003c/td\u003e\n      \u003ctd\u003e826.253049\u003c/td\u003e\n      \u003ctd\u003e870.995615\u003c/td\u003e\n      \u003ctd\u003e4.0\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003cth\u003e3\u003c/th\u003e\n      \u003ctd\u003e1\u003c/td\u003e\n      \u003ctd\u003e1\u003c/td\u003e\n      \u003ctd\u003e1\u003c/td\u003e\n      \u003ctd\u003e677.854859\u003c/td\u003e\n      \u003ctd\u003e695.055098\u003c/td\u003e\n      \u003ctd\u003e2.0\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003cth\u003e4\u003c/th\u003e\n      \u003ctd\u003e1\u003c/td\u003e\n      \u003ctd\u003e1\u003c/td\u003e\n      \u003ctd\u003e1\u003c/td\u003e\n      \u003ctd\u003e598.501198\u003c/td\u003e\n      \u003ctd\u003e632.981845\u003c/td\u003e\n      \u003ctd\u003e1.0\u003c/td\u003e\n    \u003c/tr\u003e\n  \u003c/tbody\u003e\n\u003c/table\u003e\n\u003c/div\u003e\n\n\n\n\n```python\n# Example code to extract stimuli and responses for a single subject (here S1)\nsubject = 1\ns = np.array(df['Stimulus (ms)'][df['Subject id'] == subject])\nr = np.array(df['Response (ms)'][df['Subject id'] == subject])\nprint('s.shape:', s.shape)\nprint('r.shape:', r.shape)\n\nplt.scatter(s, r)\nplt.xlabel('Stimulus $s$ (ms)')\nplt.ylabel('Response $r$ (ms)')\nplt.title('S' + str(subject))\nplt.show()\n```\n\n    s.shape: (2520,)\n    r.shape: (2520,)\n    \n\n\n![png](images/output_14_1.png)\n\n\n\n```python\ndef gaussian_response(s,theta):\n    \"\"\"Compute mean and standard deviation of p(r|s; theta).\"\"\"\n    # Unpack parameter vector theta\n    mu_prior = theta[0]\n    sigma_prior = theta[1]\n    sigma = theta[2]\n    sigma_motor = theta[3]\n    # Compute mean and std of the response\n    w = sigma_prior**2/(sigma_prior**2 + sigma**2)    \n    mu_resp = w*s + (1-w)*mu_prior\n    sigma_resp = np.sqrt(w**2*sigma**2 + sigma_motor**2)\n    return mu_resp, sigma_resp\n\ndef idealgaussianobserverwithlapse_loglike(theta,s_vec,r_vec):\n    \"\"\"Log-likelihood of ideal Gaussian observer with added lapse.\"\"\"\n    mu_prior = 780\n    sigma_prior = 140\n    \n    sigma = theta[0]\n    sigma_motor = theta[1]\n    lapse_rate = theta[2]\n    \n    lapse_pdf = 1/1500.\n    \n    mu_resp, sigma_resp = gaussian_response(s_vec,np.array((mu_prior,sigma_prior,sigma,sigma_motor)))\n    # First, compute log-likelihood without probability of lapse\n    loglike_vec = sps.norm.logpdf(r_vec,mu_resp,sigma_resp) # Vector of log-likelihood per trials\n    # Now, add the probability of lapse\n    if lapse_rate \u003e 0.:\n        likelihood_vec = np.exp(loglike_vec) # Exponentiate back to the likelihood\n        likelihood_with_lapse_vec = (1-lapse_rate)*likelihood_vec + lapse_rate*lapse_pdf\n        loglike_vec = np.log(likelihood_with_lapse_vec)\n        # This code snippet below uses the logsumexp trick, which is numerically more stable\n        # loglapse = np.log(lapse_rate*lapse_pdf)\n        # M = np.maximum(loglike, loglapse)\n        # loglike = np.log((1-lapse_rate)*np.exp(loglike-M) + np.exp(loglapse-M)) + M        \n    return np.sum(loglike_vec)\n\ntheta1 = (90,60,0.02)\n\nprint('b)')\n\nsubject = 2\ns = np.array(df['Stimulus (ms)'][df['Subject id'] == subject])\nr = np.array(df['Response (ms)'][df['Subject id'] == subject])\nloglike1 = idealgaussianobserverwithlapse_loglike(theta1,s,r)\n\nprint('SUBJECT 2: The log-likelihood of theta_1 = {} (dataset S{}) is: {}'.format(\n    theta1, subject, loglike1))\n\n\n###################################################################\n\nsubject = 5\ns = np.array(df['Stimulus (ms)'][df['Subject id'] == subject])\nr = np.array(df['Response (ms)'][df['Subject id'] == subject])\nloglike1 = idealgaussianobserverwithlapse_loglike(theta1,s,r)\n\nprint('SUBJECT 5: The log-likelihood of theta_1 = {} (dataset S{}) is: {}'.format(\n    theta1, subject, loglike1))\n\n```\n\n    b)\n    SUBJECT 2: The log-likelihood of theta_1 = (90, 60, 0.02) (dataset S2) is: -8577.318546123512\n    SUBJECT 5: The log-likelihood of theta_1 = (90, 60, 0.02) (dataset S5) is: -9105.655523964238\n    \n\n# Question 2.4 (6 pts)\n\n\u003e When fitting models to data, the experimenter may be interested in how model parameters are represented across the population (here represented by the group of subjects). A simple way to look at this is to look at the distribution of maximum-likelihood estimates for the parameters across subjects, in first instance by looking their mean and variability.\n\nWe consider here the `idealgaussianobserverwithlapse` model. This model is the same as the `gaussianobserverwithlapse` of Question 2.3, but with $\\mu_\\text{prior} = 787.5$ ms and $\\sigma_\\text{prior} = 128.1$ ms fixed. Thus, the model has three free parameters, $\\theta = \\left(\\sigma, \\sigma_\\text{motor}, \\lambda \\right)$. Fit the model using maximum-likelihood estimation.\n\n- a) First, fit the `idealgaussianobserverwithlapse` model to the six subjects' datasets (separately for each subject's data). For each maximum-likelihood estimate (MLE) of parameters $\\sigma, \\sigma_\\text{motor}, \\lambda$, report in Moodle the mean and standard deviation across the six subjects. For the standard deviation, use the correction for degrees of freedom (that is, `np.std(..., ddof=1)`).\n- b) Now fit the pooled data of all subjects as a single dataset (as if all data were collected from a single uber-subject). Report the maximum-likelihood estimate of $\\sigma, \\sigma_\\text{motor}, \\lambda$ for the pooled data in Moodle.\n\n*Hints*: \n- If you use code for the `idealgaussianobserverwithlapse` model from the lectures, be careful about the model definition.\n- As a sanity check that you have coded the log-likelihood function correctly, check that the log-likelihood of the dataset of subject 1 for $\\theta_\\star = \\left(\\sigma = 90, \\sigma_\\text{motor} = 80, \\lambda = 0.02\\right)$ is $\\log \\mathcal{L}(\\theta_\\star; \\mathcal{D}_1) \\approx -14709.795\\ldots$\n\n*Note*: Fitting individual subjects' data is the best approach to describe invidual behavior in cognitive science, but sometimes you will see studies only looking at pooled/group data. Be careful that pooling might hide what really happens, only giving a snapshot of the average behavior of the group, which might not correspond to what individuals do.\n\n\n```python\ndef idealgaussianobserverwithlapse_loglike(theta,s_vec,r_vec):\n    \"\"\"Log-likelihood of ideal Gaussian observer with added lapse.\"\"\"\n    mu_prior = 787.5\n    sigma_prior = 128.1\n    sigma = theta[0]\n    sigma_motor = theta[1]    \n    lapse_rate = theta[2]\n    lapse_pdf = 1/1500.\n    mu_resp, sigma_resp = gaussian_response(s_vec,np.array((mu_prior,sigma_prior,sigma,sigma_motor)))\n    # First, compute log-likelihood without probability of lapse\n    loglike_vec = sps.norm.logpdf(r_vec,mu_resp,sigma_resp) # Vector of log-likelihood per trials\n    # Now, add the probability of lapse\n    if lapse_rate \u003e 0.:\n        likelihood_vec = np.exp(loglike_vec) # Exponentiate back to the likelihood\n        likelihood_with_lapse_vec = (1-lapse_rate)*likelihood_vec + lapse_rate*lapse_pdf\n        loglike_vec = np.log(likelihood_with_lapse_vec)   \n    return np.sum(loglike_vec)\n\ndef multioptimize(target_fun,lower_bounds,upper_bounds,plausible_lower_bounds,plausible_upper_bounds,num_runs=3):\n    \"\"\"Simple function for multi-start optimization.\"\"\"\n    # Run num_runs optimization runs from different starting points    \n    num_params = lower_bounds.shape[0]\n    theta_res = np.zeros((num_runs,num_params))\n    nll_res = np.zeros(num_runs)    \n    \n    for index in range(num_runs):\n        if index == 0:\n            theta0 = 0.5*(plausible_lower_bounds + plausible_upper_bounds)\n        else:\n            theta0 = np.random.uniform(low=plausible_lower_bounds,high=plausible_upper_bounds)    \n        bounds = sp.optimize.Bounds(lower_bounds,upper_bounds,True) # Set hard bounds\n        res = sp.optimize.minimize(target_fun, theta0, method='L-BFGS-B', bounds=bounds)\n        nll_res[index] = res.fun\n        theta_res[index] = res.x\n        \n    # Pick the best solution\n    idx_best = np.argmin(nll_res)\n    nll_best = nll_res[idx_best]\n    theta_best = theta_res[idx_best]        \n    return nll_best,theta_best\n```\n\n\n```python\nthetas = []\nnum_runs = 10\nnum_subjects = 6\n\nfor i in range(num_subjects):\n    subject = i + 1\n    s = np.array(df['Stimulus (ms)'][df['Subject id'] == subject])\n    r = np.array(df['Response (ms)'][df['Subject id'] == subject])\n    target_fun = lambda theta_: -idealgaussianobserverwithlapse_loglike(np.array(theta_),s,r)\n\n    # Define hard parameter bounds\n    lower_bounds = np.array([1.,1.,0.])\n    upper_bounds = np.array([2000.,2000.,1.])\n\n    # Define plausible range\n    plausible_lower_bounds = np.array([np.mean(s)*0.05,np.mean(s)*0.05,0.01])\n    plausible_upper_bounds = np.array([np.mean(s)*0.20,np.mean(s)*0.20,0.05])\n    \n    #call multioptimise  for subject\n    nll2_best,theta2_best = multioptimize(target_fun,lower_bounds,upper_bounds,plausible_lower_bounds,plausible_upper_bounds,num_runs)\n    thetas.append(theta2_best)\n\nprint(f\"Thetas (= sigma, sigma_motor, lambda): {thetas}\")\n```\n\n    Thetas (= sigma, sigma_motor, lambda): [array([109.596,  42.538,   0.   ]), array([68.835, 30.883,  0.005]), array([115.919,  71.987,   0.013]), array([139.914,  91.651,   0.   ]), array([63.395, 76.46 ,  0.008]), array([93.898, 73.26 ,  0.007])]\n    \n\n\n```python\n#thetas = [sigma, ]\nsigmas = [x[0] for x in thetas]\nsigma_motors = [x[1] for x in thetas]\nlambdas = [x[2] for x in thetas]\n\n#calculate mles and stds\nsigma_mle = np.mean(sigmas)\nsigma_motors_mle = np.mean(sigma_motors)\nlambdas_mle = np.mean(lambdas)\nsigma_std = np.std(sigmas, ddof=1)\nsigma_motors_std = np.std(sigma_motors, ddof=1)\nlambdas_std = np.std(lambdas, ddof=1)\n\nprint(f\"sigma:\\n mle = {sigma_mle},\\n std = {sigma_std}\")\nprint(f\"sigma motors:\\n mle = {sigma_motors_mle},\\n std = {sigma_motors_std}\")\nprint(f\"lambdas:\\n mle = {lambdas_mle},\\n std = {lambdas_std}\")\n```\n\n    sigma:\n     mle = 98.59291952413031,\n     std = 29.241153589295255\n    sigma motors:\n     mle = 64.46303211477215,\n     std = 22.91497518125363\n    lambdas:\n     mle = 0.005600439788615318,\n     std = 0.0050677241630049065\n    \n\n\n\n```python\ndef gaussianobserverwithlapse_loglike(theta,s_vec,r_vec):\n    \"\"\"Log-likelihood of ideal Gaussian observer with added lapse.\"\"\"\n    mu_prior = 787.5\n    sigma_prior = 128.1\n    sigma = theta[0]\n    sigma_motor = theta[1]\n    lapse_rate = theta[2]\n    lapse_pdf = 1/1500\n    mu_resp, sigma_resp = gaussian_response(s_vec,np.array((mu_prior,sigma_prior,sigma,sigma_motor)))\n    # First, compute log-likelihood without probability of lapse\n    loglike_vec = sps.norm.logpdf(r_vec,mu_resp,sigma_resp) # Vector of log-likelihood per trials\n    # Now, add the probability of lapse\n    if lapse_rate \u003e 0.:\n        likelihood_vec = np.exp(loglike_vec) # Exponentiate back to the likelihood\n        likelihood_with_lapse_vec = (1-lapse_rate)*likelihood_vec + lapse_rate*lapse_pdf\n        loglike_vec = np.log(likelihood_with_lapse_vec)\n        # This code snippet below uses the logsumexp trick, which is numerically more stable\n        # loglapse = np.log(lapse_rate*lapse_pdf)\n        # M = np.maximum(loglike, loglapse)\n        # loglike = np.log((1-lapse_rate)*np.exp(loglike-M) + np.exp(loglapse-M)) + M        \n    return np.sum(loglike_vec)\n```\n\n\n```python\ntarget_fun = lambda theta_: -gaussianobserverwithlapse_loglike(np.array(theta_),s,r)\nnll2_best,theta2_best = multioptimize(target_fun,lower_bounds,upper_bounds,plausible_lower_bounds,plausible_upper_bounds,num_runs=50)\nprint(theta2_best[:3])\n```\n\n    [98.093 64.117  0.008]\n    \n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fadamelkholyy%2Fma-probcog","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fadamelkholyy%2Fma-probcog","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fadamelkholyy%2Fma-probcog/lists"}