getting xgboost tweedie prediction from link

May 21, 2020

I’ve stumbled on something... interesting. To get the prediction for a Tweedie GLM, we take the link value and compute exp(link). But to get the prediction from an xgboost Tweedie model, we take the “link” value and compute exp(link) / 2, that is, we divide the result by 2. Is this normal? Below is a quick demo showing how I get the predictions for a 3-tree xgboost model and a GLM. The code has been modified from the tweedie regression demo in the xgboost repository: https://github.
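One possible explanation for the factor of 2, offered here as an assumption rather than something this excerpt confirms: xgboost's `base_score` parameter defaulted to 0.5 at the time, and for log-link objectives it enters the margin as log(base_score). If the “link” value is only the sum of the tree leaf values, the prediction would be exp(link) × 0.5. The sketch below uses Python with NumPy only and makes no xgboost call; the function names and leaf sums are hypothetical.

```python
import numpy as np

def glm_prediction(link):
    """Log-link GLM: the prediction is simply exp(link)."""
    return np.exp(link)

def boosted_prediction(leaf_sum, base_score=0.5):
    """Assumed boosted-tree behaviour with a log link:
    margin = log(base_score) + sum of leaf values, prediction = exp(margin)."""
    margin = np.log(base_score) + leaf_sum
    return np.exp(margin)

# Hypothetical sums of leaf values over 3 trees, for three observations.
leaf_sum = np.array([-1.0, 0.0, 0.7])

from_trees = boosted_prediction(leaf_sum)
halved = glm_prediction(leaf_sum) / 2

# Under the assumption above, both routes give the same numbers.
print(np.allclose(from_trees, halved))
```

If this assumption holds, setting `base_score` to 1 would make the offset log(1) = 0, and the division by 2 would disappear.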

Local Covid19 cases: Canadian health regions and Montreal boroughs

May 10, 2020 in r, covid19

Quick post inspired by the winning / nearly there / need action graphs by @yaneerbaryam at https://www.endcoronavirus.org/countries. Data: Health regions data is compiled by Isha Berry & friends (GitHub). Montreal boroughs data is published daily. They only keep the total and keep no history, so @bouchecl visits them every day and compiles the data in this Google sheet. Code: I went a bit over the top for this one and created an R package you can install to recreate all the graphs and fetch the data.

How far are cyclists willing to go to use a cycling path? A good excuse to try out Graphhopper

April 13, 2020 in R, bike, map-matching

I ran a Twitter survey a couple of months before the apocalypse to help me pick my next blog post topic, and all 3 members of the crowd overwhelmingly agreed that I should use bike GPS data and Graphhopper to find out how far cyclists are willing to go to use safer infrastructure. This is awesome, because for a while I had been looking for a use for this open data, which contains GPS data for ~5000 bike trips in Montreal.

Who did your neighbours vote for?

April 6, 2020

Canada Federal Election 2019

Tweedie and Recipes part 2 (kaggle data)

March 27, 2020

I just got my feet wet with tweedie regression and the recipes package yesterday. The results were underwhelming, as the models didn’t appear very predictive. I figured I might give it another try, this time using Kaggle’s claim prediction challenge from 2012. It is no longer possible to submit models, so we will create our own 20% test sample from the Kaggle training data set and see how we fare.

Tweedie vs Poisson * Gamma

March 23, 2020 in R, insurance

I’m building my first tweedie model, and I’m finally trying the {recipes} package. We will try to predict the pure premium of a car insurance policy. This can be done directly with a tweedie model, or by multiplying two separate models: a frequency (Poisson) model and a severity (Gamma) model. We will be using “lift charts” and “double lift charts” to evaluate model performance. Here is the plan: Pre-process the train and test data using recipes.
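As a rough illustration of why the two approaches target the same quantity, here is a small Python/NumPy simulation. It is not the post's code: the claim rate and the Gamma shape and scale are hypothetical values chosen only for the example. It draws Poisson claim counts and Gamma claim amounts, then compares the simulated average loss per policy with the product of expected frequency and expected severity.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical portfolio parameters (assumptions for illustration only).
n_policies = 200_000
claim_rate = 0.1                     # expected number of claims per policy
sev_shape, sev_scale = 2.0, 500.0    # Gamma severity, mean = shape * scale

# Frequency: number of claims per policy.
n_claims = rng.poisson(claim_rate, size=n_policies)

# Severity: the sum of k Gamma(shape, scale) claims is Gamma(k * shape, scale).
total_loss = np.zeros(n_policies)
has_claim = n_claims > 0
total_loss[has_claim] = rng.gamma(sev_shape * n_claims[has_claim], sev_scale)

# Pure premium as frequency times severity.
expected_severity = sev_shape * sev_scale
pure_premium_product = claim_rate * expected_severity

# Pure premium as the average total loss per policy, which is what a
# tweedie model of the loss would target directly.
pure_premium_simulated = total_loss.mean()

print(pure_premium_product, pure_premium_simulated)
```

Most policies have zero loss and a few have a positive, skewed loss, which is the kind of distribution a tweedie model is meant to handle in a single step.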

United States of they don't want no socialist healthcare system

December 14, 2019 in R, demography

I found life expectancy at birth data for “health regions” in Canada for 2015-2017 and for “census tracts” in the USA for 2010-2015. Here is a map of these two countries, excluding areas with a life expectancy at birth lower than 0. Data sources and shapefiles: Canada mortality. Canada shapefiles. USA mortality. USA shapefiles downloaded using the tigris package. Libraries: The usual data wrangling libraries for spatial data (tidyverse, sf), mapped using mapview and leaflet.

Simon Coulombe