The package RMixpanel provides an interface from R to Mixpanel’s API endpoints (see https://mixpanel.com/help/reference/data-export-api and https://mixpanel.com/help/reference/exporting-raw-data). For the most frequently used API endpoints (segmentation, retention, funnel, engage, export, etc.) custom methods make the parameterization more convenient and do the conversion from JSON to a corresponding R data.frame or R matrix. Furthermore it is possible to update or delete user profiles.
mixpanelGetData).segmentation/: get the segmentation matrix using
mixpanelGetSegmentation.retention/: get the retention matrix using
mixpanelGetRetention.addiction/: get the addiction matrix using
mixpanelGetAddiction.funnel/: get funnel data using
mixpanelGetFunnel.engage/:
mixpanelGetProfiles.mixpanelUpdateProfile.stream/query/: get events of selected people profiles
using mixpanelGetEventsForProfiles.export/: get event data as R matrix using
mixpanelGetEvents.jql/: perform custom queries using
mixpanelJQLQuery.mixpanelGetProfilesCount.export/. This allows
querying data for long time spans using multiple requests.require(devtools)
devtools::install_github("ploner/RMixpanel")
require(RMixpanel)or
install.packages("RMixpanel")
require(RMixpanel)The package depends on - jsonlite - uuid - RCurl - base64enc
In order to use the various methods of this package, we need to save
the account data of the Mixpanel Project into an R object of class
mixpanelAccount. The next examples all make use of this
account object.
## Fill in here the API token, key and secret as found on
## www.mixpanel.com - Account -> Projects.
> account = mixpanelCreateAccount("ProjectName",
token="c12f3...",
secret="167e7e...",
key="553c55...")
> class(account)
[1] "mixpanelAccount"> retentions <- mixpanelGetRetention(account, born_event="AppInstall", event="WatchedItem",
from=20150701, to=20151101, unit="week")
> print(retentions)
## Example output:
## Retention Matrix
## Row names are Cohort Start Dates. Column names are Periods (0 -> 0 to 1 units)
## Count 0 1 2 3 ... ...
## 2015-06-29 17 94.11765 29.411765 29.411765 29.411765 ...
## 2015-07-06 38 100.00000 31.578947 18.421050 ...
...> mixpanelGetProfilesCount(account, where='properties["KPI1"] > 1.32')
## Example output:
## 21987 Given the people profiles have two properties named KPI1 and KPI2,
the following lines of code will load these properties for all profiles
matching the query KPI1 >= 1.32 and fill an R data.frame
with the corresponding data. The hist method could be used
to generate a histogram of one of the KPI’s.
More complex queries including logical operators and typecasts can be generated using the syntax described on Mixpanel’s documentation.
> profiles = mixpanelGetProfiles(account, where='properties["KPI1"] > 1.32',
select=c("KPI1", "KPI2"))
> print(profiles)
## Example output:
## distinct_id KPI1 KPI2
## [1,] "D1FED2..." 1.37 1.09
## [2,] "4441C5..." 2.11 -0.12
## ...
> hist(as.numeric(profiles[, "KPI1"]))Remove property KPI1 when the value is larger than
1000:
> profiles = mixpanelGetProfiles(account, where='properties["KPI1"] > 1000')
> distinctIDs = profiles[, "distinct_id"]
> for (distinctID in distinctIDs)
> mixpanelUpdateProfile(account, distinctID, data=list("$unset"="KPI1"))Delete all profiles where KPI1 is not set:
> profiles = mixpanelGetProfiles(account, where='not properties["KPI1"]')
> distinctIDs = profiles[, "distinct_id"]
> for (distinctID in distinctIDs)
> mixpanelUpdateProfile(account, distinctID, data=list("$delete"=""))Add a random value between 1 and 10 called bucket to all
people profiles:
> profiles = mixpanelGetProfiles(account)
> distinctIDs = profiles[, "distinct_id"]
> for (distinctID in distinctIDs)
> mixpanelUpdateProfile(account, distinctID,
data=list("$set"=list(bucket=jsonlite::unbox(sample(10, 1)))))The general method mixpanelGetData allows to call all
available API endpoints of the export API. However, the result is not
parsed into R objects. Calling jsonlite::fromJSON(data) on
the resulting data would do some parsing, but usually more
postprocessing is needed.
Here an example without transforming the resulting JSON into handy R objects:
## Get list of funnels.
> mixpanelGetData(account, method="funnels/list/", args=list(), data=TRUE)
## Example output:
## [1] "[{\"funnel_id\": 1011888, \"name\": \"My first funnel\"},
## {\"funnel_id\": 1027999, \"name\": \"User journey funnel\"}]"
## Get data about a certain funnel.
> mixpanelGetData(account, method = "funnels/", args = list(funnel_id="1027999", unit="week"),
data = TRUE)
## Example output:
## [1] "{\"meta\": {\"dates\": [\"2015-11-04\", \"2015-11-11\"]},
## \"data\": {\"2015-11-11\":
## {\"steps\": [
## {\"count\": 7777, \"step_conv_ratio\": 1, \"goal\": \"AppInstall\", \"overall_conv_ratio\":1,
## \"avg_time\": null, \"event\": \"AppInstall\"},
## {\"count\": 555, \"avg_time\": 111, \"goal\": \"OpenedView\", \"overall_conv_ratio\": 0.77,
## \"selector\": \"(properties[\\\"status\\\"] == \\\"loggedin\\)\",
## \"step_conv_ratio\": 0.06964335860713283, \"event\": \"OpenedView\"},
## {\"count\": 333, \"avg_time\": 222, ...
## ...The JQL Query language opens a wide spectrum of possibilities. As a simple example we extract the event count per user (‘distinct_id’). The Mixpanel JQL API Reference can be found on https://mixpanel.com/help/reference/jql/api-reference.
jqlQuery <- '
function main() {
return Events({
from_date: "2016-01-01",
to_date: "2016-12-31"
})
.groupByUser(mixpanel.reducer.count())
}'
res <- mixpanelJQLQuery(account, jqlQuery,
columnNames=c("distinctID", "Count"), toNumeric=2)
hist(res$Count)Here we show how to calculate the metric Daily Active Users (DAU) when the user ID is different from the distinct_id. First write the JQL query and save it into a file named jqlDAU.js:
function today(addDays) {
var day = new Date();
day.setDate(day.getDate() + (addDays || 0));
return day.toISOString().substr(0, 10);
}
function main() {
return Events({
from_date: today(dayFrom),
to_date: today(dayTo)
})
.groupBy(["properties.UserID", getDay], function(count, events) {
count = count || 0;
return count + events.length;
})
.groupBy(["key.1"], mixpanel.reducer.count());
}The parameters
mixpanelJQLQuery(account, jqlString="dayFrom=-7; dayTo=-1;", jqlScripts="jqlDAU.js")