feat: Create split testing for multivariate #3235

Merged
merged 94 commits into from
Feb 12, 2024
Changes from 17 commits
Commits
5f30d88
Add SciPy
zachaysan Jan 3, 2024
4a661d0
Create AppConfig
zachaysan Jan 3, 2024
8e70834
Add helpers
zachaysan Jan 3, 2024
4bb7ed9
Add SplitTestPermissions
zachaysan Jan 3, 2024
0401010
Create SplitTest serializers
zachaysan Jan 3, 2024
5262709
Create split testing tasks
zachaysan Jan 3, 2024
9db92ab
Create SplitTest views
zachaysan Jan 3, 2024
3d9838c
Create SplitTest models and migration
zachaysan Jan 3, 2024
73e59fd
Add db schema for FeatureEvaluationRaw
zachaysan Jan 3, 2024
dabb6df
Add split testing to installed apps
zachaysan Jan 3, 2024
eaacb64
Add identifier and typing
zachaysan Jan 3, 2024
acad9e1
Create test_set_sdk_analytics_flags_with_identifier test
zachaysan Jan 3, 2024
ee9ac6c
Add identifier to track
zachaysan Jan 3, 2024
40f59fd
Add split testing routes
zachaysan Jan 3, 2024
c8727ee
Test split testing helpers
zachaysan Jan 3, 2024
8dd9bcc
Create split testing task test
zachaysan Jan 3, 2024
6515e1d
Create tests for split testing views
zachaysan Jan 3, 2024
486c73b
Change name to identity_identifier
zachaysan Jan 8, 2024
b3c9915
Create query serializer and add comment
zachaysan Jan 8, 2024
d031fcf
Remove label since its unnecessary
zachaysan Jan 8, 2024
588b2df
Restructure helpers and add docstrings
zachaysan Jan 8, 2024
88e5d44
Update to identity_identifier
zachaysan Jan 8, 2024
b8c94c5
Minor tweaks to tests with codebase change
zachaysan Jan 8, 2024
1b9be4a
Move class meta to below fields and remove statistic
zachaysan Jan 8, 2024
1b93551
Change to identity identifier and remove Optional
zachaysan Jan 8, 2024
68bb1e8
Add query serializer and avoid 500 error if the environment couldn't …
zachaysan Jan 8, 2024
fc70e00
Create query param serializer
zachaysan Jan 8, 2024
45d9cbb
Switch to bulk update instead of bulk delete and switch to identity i…
zachaysan Jan 8, 2024
08f41fe
Switch to mixins and use query serializer
zachaysan Jan 8, 2024
19f96e5
Fix conflicts and merge branch 'main' into feat/create_split_testing_…
zachaysan Jan 8, 2024
5ecf162
Add query params to mock
zachaysan Jan 9, 2024
930d00f
Add control feature for listing split tests
zachaysan Jan 9, 2024
830f6d8
Add split test creation and update tasks and add feature control for …
zachaysan Jan 9, 2024
6fad377
Make multivariate_feature_option nullable
zachaysan Jan 9, 2024
b53f906
Sort by nulls first on multivariate order by
zachaysan Jan 9, 2024
565e306
Split multivariate feature option into nullable field
zachaysan Jan 9, 2024
ba38b62
Multivariate feature option set to null
zachaysan Jan 9, 2024
713aaec
Split up into multiple tasks
zachaysan Jan 9, 2024
f6aac25
Randomness shows lower values than I expected
zachaysan Jan 9, 2024
0748f03
Add test for enabled when evaluated
zachaysan Jan 12, 2024
fc6cb7e
Update split testing views tests
zachaysan Jan 12, 2024
901121a
Add enabled when evaluated and conversion event types to task tests
zachaysan Jan 12, 2024
04189bf
Add enabled when evaluated to view
zachaysan Jan 12, 2024
a6c80c0
Add enabled when evaluated to track feature evaluation task
zachaysan Jan 12, 2024
3527e36
Add ConversionEventType and associated relations
zachaysan Jan 12, 2024
5c04fa8
Add enabled when evaluated to FeatureEvaluationRaw
zachaysan Jan 12, 2024
6235718
Add ConversionEventType view and add serializer context for split tests
zachaysan Jan 12, 2024
63d677a
Filter for split tests with enabled when evaluated and add CET
zachaysan Jan 12, 2024
7a20869
Add mv/fsv value_data response and add ConversionEventType serializers
zachaysan Jan 12, 2024
b5d8ecc
Change split test permission check and add ConversionEventTypePermiss…
zachaysan Jan 12, 2024
6b30732
Add enabled when evaluated
zachaysan Jan 12, 2024
b13a24f
Add ConversionEventTypeView path url
zachaysan Jan 12, 2024
4306cf7
Fix wording
zachaysan Jan 12, 2024
dcfb77c
Add a test for conversion events that preceed the feature from being …
zachaysan Jan 15, 2024
5906620
Ensure that conversions follow first feature evaluation
zachaysan Jan 15, 2024
6f90d66
Fix conflicts and merge branch 'main' into feat/create_split_testing_…
zachaysan Jan 15, 2024
52e6208
Change test to test for 202 and other minor fixes
zachaysan Jan 18, 2024
6a3fb7b
Switch to CreateAPIView and 202 for default response
zachaysan Jan 18, 2024
f817877
Fix wording
zachaysan Jan 18, 2024
91d21e1
Fix conflicts and merge branch 'main' into feat/create_split_testing_…
zachaysan Jan 18, 2024
554ced3
Tweak wording, typing, and add swagger_auto_schema for response
zachaysan Jan 18, 2024
ff00bca
Remove split testing logic from repo
zachaysan Jan 23, 2024
5d8f0a1
Remove scipy and numpy requirements from repo
zachaysan Jan 23, 2024
7cd1337
Remove split testing urls
zachaysan Jan 23, 2024
727872d
Remove split testing app from INSTALLED_APPS
zachaysan Jan 23, 2024
67dfb73
Attempt #1 to run analytics tasks in CI
zachaysan Jan 23, 2024
ae058d4
Make reason a kwarg
zachaysan Jan 23, 2024
5d87fb7
Add default to test list
zachaysan Jan 23, 2024
4c8c9ac
Fix broken asserts
zachaysan Jan 23, 2024
df5ae38
Try re-enabling postgres for the test since it's on the wrong path
zachaysan Jan 23, 2024
a6c8a56
Trigger Build
zachaysan Jan 23, 2024
97427e6
Trigger build
zachaysan Jan 23, 2024
e47ab72
Trigger build
zachaysan Jan 23, 2024
412e2c7
Add split testing settings
matthewelwell Jan 24, 2024
e916398
Re-add split test views conditionally
zachaysan Jan 24, 2024
63b78c9
Manually fix poetry.lock from main
zachaysan Jan 29, 2024
ab9b35a
Merge branch 'main' into feat/create_split_testing_for_multivariate
zachaysan Jan 29, 2024
2d63586
Switch view tests to v2 endpoint
zachaysan Jan 29, 2024
6db5620
Create v2 urls
zachaysan Jan 29, 2024
dc032da
Add v2 urls to urls
zachaysan Jan 29, 2024
49e7231
Create v2 version of sdk analytics views
zachaysan Jan 29, 2024
80d038e
Create serializers for v2 sdk analytics
zachaysan Jan 29, 2024
e8f7da6
Add track_feature_evaluation_v2 to tasks
zachaysan Jan 29, 2024
2cf853b
Add v2 mirror for influxdb records
zachaysan Jan 29, 2024
006066b
Local test
zachaysan Jan 30, 2024
7edfae6
Update sdk tests with identifier and tasks
zachaysan Jan 31, 2024
bbeb4a2
Required set to False for field
zachaysan Jan 31, 2024
b121ac4
Set to use a task
zachaysan Jan 31, 2024
541850a
Fix HTTP Status value and use tasks for sdk processing with influx
zachaysan Jan 31, 2024
4e39665
Merge branch 'main' into feat/create_split_testing_for_multivariate
zachaysan Jan 31, 2024
1149ae2
Add throttle classes to fix broken tests
zachaysan Jan 31, 2024
e8e8c41
Merge branch 'main' into feat/create_split_testing_for_multivariate
zachaysan Jan 31, 2024
cf10217
Trigger build
zachaysan Jan 31, 2024
7764318
Add v2 endpoint
zachaysan Jan 31, 2024
19 changes: 17 additions & 2 deletions api/api/urls/v1.py
@@ -1,3 +1,7 @@
+from app_analytics.split_testing.views import (
+    ConversionEventViewSet,
+    SplitTestViewSet,
+)
 from app_analytics.views import SDKAnalyticsFlags, SelfHostedTelemetryAPIView
 from django.conf.urls import url
 from django.urls import include
@@ -27,6 +31,12 @@
 traits_router = routers.DefaultRouter()
 traits_router.register(r"", SDKTraits, basename="sdk-traits")

+split_testing_router = routers.DefaultRouter()
+split_testing_router.register(
+    r"conversion-events", ConversionEventViewSet, basename="conversion-events"
+)
+split_testing_router.register(r"", SplitTestViewSet, basename="split-tests")
+
 app_name = "v1"

 urlpatterns = [
@@ -47,8 +57,13 @@
     url(r"^flags/$", SDKFeatureStates.as_view(), name="flags"),
     url(r"^identities/$", SDKIdentities.as_view(), name="sdk-identities"),
     url(r"^traits/", include(traits_router.urls), name="traits"),
-    url(r"^analytics/flags/$", SDKAnalyticsFlags.as_view()),
-    url(r"^analytics/telemetry/$", SelfHostedTelemetryAPIView.as_view()),
+    url(r"^split-testing/", include(split_testing_router.urls), name="split-testing"),
+    url(r"^analytics/flags/$", SDKAnalyticsFlags.as_view(), name="analytics-flags"),
+    url(
+        r"^analytics/telemetry/$",
+        SelfHostedTelemetryAPIView.as_view(),
+        name="analytics-telemetry",
+    ),
     url(
         r"^environment-document/$",
         SDKEnvironmentAPIView.as_view(),
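The hunk above registers `conversion-events` on the router before the empty-prefix `split-tests` viewset. One likely reason the order matters: Django tries URL patterns in order, and a detail route under an empty prefix would also match `conversion-events/`. The sketch below imitates this with plain `re`; the pattern list is a hypothetical stand-in for what a router would generate, and it assumes DRF's default lookup regex `[^/.]+` and that the split-test viewset exposes a detail route, neither of which is shown in this diff.

```python
import re
from typing import Optional

# Hypothetical stand-ins for router-generated patterns under "split-testing/".
# Assumes the default lookup regex "[^/.]+" for the detail route.
PATTERNS_IN_REGISTRATION_ORDER = [
    ("conversion-events-list", re.compile(r"^conversion-events/$")),
    ("split-tests-list", re.compile(r"^$")),
    ("split-tests-detail", re.compile(r"^(?P<pk>[^/.]+)/$")),
]


def resolve(path: str, patterns=PATTERNS_IN_REGISTRATION_ORDER) -> Optional[str]:
    """Return the name of the first matching pattern (first match wins)."""
    for name, pattern in patterns:
        if pattern.match(path):
            return name
    return None
```

With the patterns reversed, `conversion-events/` would be captured by the detail pattern as a primary key, which is why the more specific prefix is listed first in this sketch.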
1 change: 1 addition & 0 deletions api/app/settings/common.py
@@ -164,6 +164,7 @@
     "softdelete",
     "metadata",
     "app_analytics",
+    "app_analytics.split_testing",
 ]

 SITE_ID = 1
@@ -0,0 +1,36 @@
# Generated by Django 3.2.23 on 2024-01-02 16:35

from django.db import migrations, models
from core.migration_helpers import PostgresOnlyRunSQL


class Migration(migrations.Migration):

    atomic = False

    dependencies = [
        ('app_analytics', '0001_initial'),
    ]

    operations = [
        migrations.AddField(
            model_name='featureevaluationraw',
            name='identifier',
            field=models.CharField(default=None, max_length=2000, null=True),
        ),
        migrations.SeparateDatabaseAndState(
            state_operations=[
                migrations.AlterField(
                    model_name='featureevaluationraw',
                    name='feature_name',
                    field=models.CharField(db_index=True, max_length=2000),
                ),
            ],
            database_operations=[
                PostgresOnlyRunSQL(
                    'CREATE INDEX CONCURRENTLY "app_analytics_featureevaluationraw_feature_name_idx" ON "app_analytics_featureevaluationraw" ("feature_name");',
                    reverse_sql='DROP INDEX CONCURRENTLY "app_analytics_featureevaluationraw_feature_name_idx";',
                )
            ],
        ),
    ]
5 changes: 4 additions & 1 deletion api/app_analytics/models.py
@@ -73,11 +73,14 @@ def check_overlapping_buckets(self):


 class FeatureEvaluationRaw(models.Model):
-    feature_name = models.CharField(max_length=2000)
+    feature_name = models.CharField(db_index=True, max_length=2000)
     environment_id = models.PositiveIntegerField()
     evaluation_count = models.IntegerField(default=0)
     created_at = models.DateTimeField(auto_now_add=True)
+
+    # Identity identifier stored for tracking multivariate split testing.
+    identifier = models.CharField(max_length=2000, null=True, default=None)


 class FeatureEvaluationBucket(AbstractBucket):
     feature_name = models.CharField(max_length=2000)
6 changes: 6 additions & 0 deletions api/app_analytics/split_testing/apps.py
@@ -0,0 +1,6 @@
from django.apps import AppConfig


class AppAnalyticsConfig(AppConfig):
    name = "app_analytics.split_testing"
    label = "app_analytics_split_testing"
36 changes: 36 additions & 0 deletions api/app_analytics/split_testing/helpers.py
@@ -0,0 +1,36 @@
import numpy as np
from scipy.stats import chi2_contingency


def analyse_split_test(observed_matrix: np.array) -> tuple[float, float]:
    # Replace zero values so that the chi-squared results can
    # be fully calculated. Don't worry about false results since
    # the pvalue will be much too low to matter to the user.
    replacement_value = 1
    observed_matrix = np.where(observed_matrix == 0, replacement_value, observed_matrix)

    # Calculate the results with correction set to `True` and the
    # lambda set to what is commonly known as the G-Test.
    results = chi2_contingency(
        observed_matrix,
        correction=True,
        lambda_="log-likelihood",
    )

    # Return the most important result, the pvalue, as well as a
    # possibly useful additional statistic for the frontend.
    # Typically a pvalue of around 1% is ideal, though as large
    # as 5% is acceptable for some tests.
    return results.pvalue, results.statistic


def gather_split_test_metrics(
    evaluation_counts: dict[int, int], conversion_counts: dict[int, int]
) -> tuple[float, float]:
    _evaluation_counts = []
    _conversion_counts = []
    for mv_feature_option_id, evaluation_count in evaluation_counts.items():
        _evaluation_counts.append(evaluation_count)
        _conversion_counts.append(conversion_counts[mv_feature_option_id])
    input_data = np.array([_conversion_counts, _evaluation_counts])
    return analyse_split_test(input_data)
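To make the "G-Test" comment in `analyse_split_test` concrete, here is a rough standard-library sketch of the log-likelihood statistic for a 2x2 table. It is not the PR's implementation: the function name `g_test_2x2` is hypothetical, it handles only 2x2 tables (one degree of freedom), and it omits the continuity correction that `chi2_contingency(..., correction=True)` applies in that case, so its numbers will differ slightly from SciPy's.

```python
import math


def g_test_2x2(observed: list[list[float]]) -> tuple[float, float]:
    """Return (statistic, pvalue) of a G-test on a 2x2 table, no continuity correction."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            if count > 0:
                statistic += 2.0 * count * math.log(count / expected)

    # Floating-point rounding can leave a tiny negative value.
    statistic = max(statistic, 0.0)
    # Survival function of a chi-squared distribution with one degree of freedom.
    pvalue = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, pvalue
```

When both rows have the same proportions, for example `[[10, 20], [100, 200]]`, the statistic is zero and the p-value is one; the further the proportions diverge, the smaller the p-value becomes.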
48 changes: 48 additions & 0 deletions api/app_analytics/split_testing/migrations/0001_initial.py
@@ -0,0 +1,48 @@
# Generated by Django 3.2.23 on 2024-01-03 18:50

from django.db import migrations, models
import django.db.models.deletion


class Migration(migrations.Migration):

    initial = True

    dependencies = [
        ('environments', '0033_add_environment_feature_state_version_logic'),
        ('identities', '0002_alter_identity_index_together'),
        ('features', '0062_alter_feature_segment_unique_together'),
        ('multivariate', '0007_alter_boolean_values'),
    ]

    operations = [
        migrations.CreateModel(
            name='SplitTest',
            fields=[
                ('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
                ('evaluation_count', models.PositiveIntegerField()),
                ('conversion_count', models.PositiveIntegerField()),
                ('pvalue', models.FloatField()),
                ('statistic', models.FloatField()),
                ('created_at', models.DateTimeField(auto_now_add=True, null=True)),
                ('updated_at', models.DateTimeField(auto_now=True)),
                ('environment', models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, related_name='split_tests', to='environments.environment')),
                ('feature', models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, related_name='split_tests', to='features.feature')),
                ('multivariate_feature_option', models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, to='multivariate.multivariatefeatureoption')),
            ],
        ),
        migrations.CreateModel(
            name='ConversionEvent',
            fields=[
                ('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
                ('created_at', models.DateTimeField(auto_now_add=True, null=True)),
                ('updated_at', models.DateTimeField(auto_now=True)),
                ('environment', models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, related_name='conversion_events', to='environments.environment')),
                ('identity', models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, related_name='conversion_events', to='identities.identity')),
            ],
        ),
        migrations.AddConstraint(
            model_name='splittest',
            constraint=models.UniqueConstraint(fields=('environment', 'feature', 'multivariate_feature_option'), name='unique_environment_feature_mvfo'),
        ),
    ]
Empty file.
55 changes: 55 additions & 0 deletions api/app_analytics/split_testing/models.py
@@ -0,0 +1,55 @@
from django.db import models

from environments.identities.models import Identity
from environments.models import Environment
from features.models import Feature
from features.multivariate.models import MultivariateFeatureOption


class ConversionEvent(models.Model):
    environment = models.ForeignKey(
        Environment, related_name="conversion_events", on_delete=models.CASCADE
    )
    identity = models.ForeignKey(
        Identity,
        related_name="conversion_events",
        on_delete=models.CASCADE,
    )

    created_at = models.DateTimeField(null=True, auto_now_add=True)
    updated_at = models.DateTimeField(auto_now=True)


class SplitTest(models.Model):
    class Meta:
        constraints = [
            models.UniqueConstraint(
                fields=["environment", "feature", "multivariate_feature_option"],
                name="unique_environment_feature_mvfo",
            )
        ]

    environment = models.ForeignKey(
        Environment, related_name="split_tests", on_delete=models.CASCADE
    )
    feature = models.ForeignKey(
        Feature, related_name="split_tests", on_delete=models.CASCADE
    )
    multivariate_feature_option = models.ForeignKey(
        MultivariateFeatureOption, on_delete=models.CASCADE
    )

    # Populated by the split testing tasks.py with the number
    # of unique identifiers for a single feature / environment
    # combination. Multiple occurrences are ignored.
    evaluation_count = models.PositiveIntegerField()
    # Populated from the ConversionEvent model for matching identifiers.
    conversion_count = models.PositiveIntegerField()

    # Split test metrics, where the pvalue is the most useful.
    # See the analyse_split_test helper function for more details.
    pvalue = models.FloatField(null=False)
    statistic = models.FloatField(null=False)

    created_at = models.DateTimeField(null=True, auto_now_add=True)
    updated_at = models.DateTimeField(auto_now=True)
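The comment on `evaluation_count` says repeated evaluations by the same identifier are counted once. A minimal plain-Python sketch of that counting rule follows. The function name and the `(identifier, option_id)` tuple shape are hypothetical, since the task that fills this field is not part of this diff, and skipping rows without an identifier is an assumption based on the nullable `identifier` column.

```python
from collections import defaultdict
from typing import Iterable, Optional


def count_unique_identifiers(
    evaluations: Iterable[tuple[Optional[str], int]],
) -> dict[int, int]:
    """Count distinct identifiers per option id, ignoring repeats and missing identifiers."""
    seen: dict[int, set[str]] = defaultdict(set)
    for identifier, option_id in evaluations:
        if identifier is None:
            # Assumption: evaluations without an identifier cannot be attributed.
            continue
        seen[option_id].add(identifier)
    return {option_id: len(identifiers) for option_id, identifiers in seen.items()}
```

For instance, two evaluations by `"alice"` and one by `"bob"` against the same option would yield a count of 2 for that option.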
23 changes: 23 additions & 0 deletions api/app_analytics/split_testing/permissions.py
@@ -0,0 +1,23 @@
from rest_framework.permissions import IsAuthenticated
from rest_framework.request import Request
from rest_framework.viewsets import ModelViewSet

from environments.models import Environment
from environments.permissions.constants import VIEW_ENVIRONMENT


class SplitTestPermissions(IsAuthenticated):
    def has_permission(self, request: Request, view: ModelViewSet) -> bool:
        if not super().has_permission(request, view):
            return False

        environment_id = request.query_params.get("environment_id")

        if not environment_id:
            return False

        environment = Environment.objects.get(id=environment_id)

        return request.user.has_environment_permission(
            permission=VIEW_ENVIRONMENT, environment=environment
        )
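In the version shown here, `Environment.objects.get(id=environment_id)` raises `Environment.DoesNotExist` when no row matches, which would surface as a server error unless it is handled somewhere; a later commit in the list above (68bb1e8) mentions avoiding a 500 error in this situation. The sketch below only illustrates the control flow of denying access for an unknown id. It is framework-free, and every name in it (`can_view_split_tests`, the `environments` mapping, the `has_view_permission` callable) is a hypothetical stand-in rather than the project's API.

```python
from typing import Callable, Mapping, Optional


def can_view_split_tests(
    environment_id: Optional[str],
    environments: Mapping[str, object],
    has_view_permission: Callable[[object], bool],
) -> bool:
    """Deny access when the id is missing or unknown instead of raising."""
    if not environment_id:
        return False
    # Stand-in for a database lookup that returns None when nothing matches.
    environment = environments.get(environment_id)
    if environment is None:
        return False
    return has_view_permission(environment)
```

In a Django permission class, the analogous step would be catching the `DoesNotExist` exception (or using a queryset's `.first()`) and returning `False`.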
40 changes: 40 additions & 0 deletions api/app_analytics/split_testing/serializers.py
@@ -0,0 +1,40 @@
from rest_framework import serializers

from environments.identities.models import Identity
from features.multivariate.serializers import (
    NestedMultivariateFeatureOptionSerializer,
)
from features.serializers import FeatureSerializer

from .models import ConversionEvent, SplitTest


class ConversionEventSerializer(serializers.Serializer):
    identity_identifier = serializers.CharField(required=True)

    def save(self, *args, **kwargs) -> ConversionEvent:
        environment = self.context["request"].environment
        identity = Identity.objects.get(
            environment=environment,
            identifier=self.validated_data["identity_identifier"],
        )
        return ConversionEvent.objects.create(
            environment=environment,
            identity=identity,
        )


class SplitTestSerializer(serializers.ModelSerializer):
    feature = FeatureSerializer()
    multivariate_feature_option = NestedMultivariateFeatureOptionSerializer()

    class Meta:
        model = SplitTest
        fields = (
            "feature",
            "multivariate_feature_option",
            "evaluation_count",
            "conversion_count",
            "pvalue",
            "statistic",
        )