dalgot4d / ddp_backend
Django app for the DDP platform
License: GNU Affero General Public License v3.0
Since we have only one destination warehouse, store it in the Org object
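A minimal sketch of the idea, with illustrative field names rather than the actual schema:

    from django.db import models

    class Org(models.Model):
        """Since each org has exactly one destination warehouse,
        its details can live directly on the Org."""
        name = models.CharField(max_length=50)
        airbyte_workspace_id = models.CharField(max_length=36, null=True)
        # the single destination warehouse for this org (illustrative fields)
        warehouse_type = models.CharField(max_length=25, null=True)  # "postgres" | "bigquery"
        airbyte_destination_id = models.CharField(max_length=36, null=True)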
Could we write tests using https://pytest.org/
(Or if someone wants to recommend another testing framework that's fine too)
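A minimal sketch with pytest (plus pytest-django for DB access); the model fields are illustrative:

    import pytest
    from ddpui.models.org import Org  # assumed import path

    @pytest.mark.django_db
    def test_org_stores_warehouse():
        org = Org.objects.create(name="test-org", warehouse_type="postgres")
        assert org.warehouse_type == "postgres"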
Pick a naming convention for functions, classes and variables
Update all existing code
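One possible pick is PEP 8, shown here with names that already appear in this repo:

    # PEP 8: classes in PascalCase, functions and variables in snake_case
    class OrgPrefectBlock:  # class: PascalCase
        pass

    def get_airbyte_server_block_id(block_name):  # function and args: snake_case
        block_id = None  # variable: snake_case
        return block_id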
fixing the login/signup flow
{"operations": ...} is accepted by the /web_backend/connections/create
not by /connections/create
!!
when starting a prefect flow, set the names of the flow / flow-run for easy lookup later
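A sketch, assuming Prefect 2.x, which lets the flow-run name be templated from flow parameters:

    from prefect import flow

    @flow(name="airbyte-sync", flow_run_name="airbyte-sync-{connection_id}")
    def run_airbyte_sync(connection_id: str):
        ...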
In ddpairbyte/functions.py: use scripts/test-airbyte-api.py as a starting point, and write tests for every function in functions.py
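A sketch of one such test, assuming functions.py calls the Airbyte API via requests (the function name is illustrative):

    from unittest.mock import Mock, patch
    from ddpui.ddpairbyte import functions

    @patch("ddpui.ddpairbyte.functions.requests.post")
    def test_get_workspaces(mock_post):
        mock_post.return_value = Mock(status_code=200, json=lambda: {"workspaces": []})
        assert functions.get_workspaces() == {"workspaces": []}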
We have endpoints to run flows (/prefect/flows/airbyte_sync/ and /prefect/flows/dbt_run/) but none to see an organization's flow history
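A sketch of what a history endpoint could look like, django-ninja style; the route and helper names are assumptions:

    @prefectapi.get("/flows/history")
    def get_flow_run_history(request, limit: int = 50):
        org = request.orguser.org  # assumed auth object
        return prefect_service.get_flow_runs_for_org(org.slug, limit=limit)  # hypothetical helper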
Instead of taking dbtCredentials from the frontend, we will prepare this in the backend, so that if we add a new destination we will only have to update the backend.
Store the token. Use it when git-cloning (and git-pulling?)
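A sketch of using the stored token during a clone; embedding it in the remote URL is one option, and the helper is illustrative:

    import subprocess

    def git_clone_with_token(repo_url: str, token: str, target_dir: str):
        # https://github.com/org/repo.git -> https://<token>@github.com/org/repo.git
        authed_url = repo_url.replace("https://", f"https://{token}@")
        subprocess.check_call(["git", "clone", authed_url, target_dir])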
I just committed
(This is the last commit I'm making to the master branch)
@Ishankoradia would you please refactor according to what we discussed last week i.e.
For our first version we will only support BigQuery and Postgres warehouses. The client's warehouse is the only Airbyte destination we support, and so the Airbyte destination configuration should only support BigQuery and Postgres
No secrets, no filesystem details
GitHub repo url
dbt target: name and target schema
Airbyte responses often include this HUGE icon field... 100 lines of SVG... remove it before logging
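A sketch of stripping it recursively before the response hits the logger:

    def scrub_icons(obj):
        """Drop every "icon" key from a nested Airbyte response."""
        if isinstance(obj, dict):
            return {k: scrub_icons(v) for k, v in obj.items() if k != "icon"}
        if isinstance(obj, list):
            return [scrub_icons(v) for v in obj]
        return obj

    # usage: logger.info(scrub_icons(airbyte_response)) instead of logging the raw response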
Right now we have endpoints like /createuser and /updateuser; replace them with /user plus the appropriate HTTP verb
Also document this in the README
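A sketch of the RESTful shape in django-ninja style; the router and schema names are illustrative:

    @userapi.post("/user")       # was /createuser
    def create_user(request, payload: UserSchema): ...

    @userapi.put("/user")        # was /updateuser
    def update_user(request, payload: UserSchema): ...

    @userapi.get("/user")
    def get_user(request): ...

    @userapi.delete("/user")
    def delete_user(request): ...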
When a new Org is created, create the Airbyte workspace at the same time
Some of our API handlers take a while to execute; instead of risking a timeout we should run them separately and allow the client to check on their progress
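One possible shape: run the work in a background task, return a task id, and let the client poll. Celery is shown as one option; all names here are illustrative:

    from celery import shared_task
    from celery.result import AsyncResult

    @shared_task
    def create_connection_task(org_id: int):
        ...  # the slow Airbyte call lives here

    @api.post("/airbyte/connections/async")
    def start_create_connection(request):
        task = create_connection_task.delay(request.orguser.org.id)
        return {"task_id": task.id}

    @api.get("/tasks/{task_id}")
    def get_task_status(request, task_id: str):
        result = AsyncResult(task_id)
        return {"status": result.status}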
For Sneha we're using a custom connector which we updated: there was a schema change in the API, so we had to write and update the connector again. But the updated connector is not pushed to master, so the Commcare connector currently on the platform will not work and we don't want to use it.
Proposed solution -
We can build the new connector by building its Docker image with the latest code, but then you'll have to pull the new connector into your connector list and use that one to create the connection.
But when the PR is merged to master, how will we update the existing connector? Something to think about
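A sketch of the build step; the path and image tag are illustrative:

    import subprocess

    # build the updated Commcare connector image from the local connector repo...
    subprocess.check_call(
        ["docker", "build", "-t", "airbyte/source-commcare:dev", "."],
        cwd="/path/to/connector-repo",  # illustrative
    )
    # ...then add "airbyte/source-commcare:dev" to Airbyte's connector list
    # and recreate the connection against it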
The DDP warehouse is used by Airbyte and by dbt. From Airbyte's point of view it is a destination and needs to be set up as one. For dbt it is a warehouse for which dbt requires credentials in order to read from and write to it.
When the user sets up their DDP warehouse we need to set up both their Airbyte destination as well as their dbt warehouse. For Postgres and BigQuery, Airbyte requires at least as much configuration information as dbt, and so we render the UI using Airbyte's destination specification to collect that information, part of which we then store in our db to be able to construct dbt's profiles.yml
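A sketch of turning the stored subset into profiles.yml, shown for Postgres; the attribute names on `warehouse` are illustrative, while the output keys follow dbt's documented postgres profile format:

    import yaml

    def make_profiles_yml(warehouse) -> str:
        return yaml.safe_dump({
            warehouse.profile_name: {
                "target": warehouse.target_name,
                "outputs": {
                    warehouse.target_name: {
                        "type": "postgres",
                        "host": warehouse.host,
                        "port": warehouse.port,
                        "user": warehouse.username,
                        "password": warehouse.password,
                        "dbname": warehouse.database,
                        "schema": warehouse.target_schema,
                    }
                },
            }
        })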
When the frontend sends Airbyte connector creation requests, integers might come in as strings, which Airbyte will reject
Check the payload against the schema and type cast when possible
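A sketch of the cast, walking the connector spec's JSON schema; the exact payload shapes are assumptions:

    def typecast_config(spec_properties: dict, config: dict) -> dict:
        """Coerce string values to int where the spec says "integer"."""
        cast = dict(config)
        for key, prop in spec_properties.items():
            value = cast.get(key)
            if prop.get("type") == "integer" and isinstance(value, str) and value.isdigit():
                cast[key] = int(value)
        return cast

    # usage: typecast_config(spec["connectionSpecification"]["properties"],
    #                        payload["connectionConfiguration"])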
The prefect-service's API to fetch a flow-run's logs takes an offset parameter, but the Django application doesn't send a value. This enhancement simply takes an optional offset from the (frontend) client and forwards it to the prefect-service
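A sketch of the pass-through, with illustrative route and helper names:

    @prefectapi.get("/flow_runs/{flow_run_id}/logs")
    def get_flow_run_logs(request, flow_run_id: str, offset: int = 0):
        # offset is optional for the client and defaults to 0, as before
        return prefect_service.get_flow_run_logs(flow_run_id, offset)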
Can we please create a dev_requirements.txt file here? Also, mention all the dependencies and their version numbers.
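A possible starting point; the versions are illustrative placeholders, not the pins the repo actually uses:

    # dev_requirements.txt (illustrative)
    pytest==7.3.1
    pytest-django==4.5.2
    pylint==2.17.4
    black==23.3.0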
A new API that takes a deploymentId and creates a flow run for this deployment
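A sketch; Prefect 2's REST API exposes POST /deployments/{deployment_id}/create_flow_run, while the route and constant names here are illustrative:

    import requests

    @prefectapi.post("/flows/{deployment_id}/flow_run")
    def post_create_flow_run(request, deployment_id: str):
        res = requests.post(
            f"{PREFECT_API_URL}/deployments/{deployment_id}/create_flow_run",
            json={},  # empty body -> default parameters
            timeout=30,
        )
        res.raise_for_status()
        return res.json()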
For dbt, a profile is referenced in dbt_project.yml and has targets which appear in profiles.yml. For us, a ddp dbt profile is a dbt profile target associated to a single target schema. The target name for the profiles.yml (which we generate) is up to us, so we will just use the name of the target schema / dataset. We envision a user creating several ddp dbt profiles in their workspace, one for production and others for testing and experimentation, each writing to a different target schema / dataset. When we create a ddp dbt profile we create three Prefect blocks: dbt test, dbt run, dbt docs generate, named using …
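A sketch of those three blocks, assuming prefect-dbt's DbtCoreOperation and deriving the (illustrative) block names from the target schema:

    from prefect_dbt.cli.commands import DbtCoreOperation

    def create_profile_blocks(target_schema: str):
        for command in ("test", "run", "docs generate"):
            block = DbtCoreOperation(commands=[f"dbt {command}"])
            # e.g. "staging-dbt-test", "staging-dbt-run", "staging-dbt-docs-generate"
            block.save(f"{target_schema}-dbt-{command.replace(' ', '-')}")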
this is the traceback I get when I run the migration. It could be related to the Python version: I'm using 3.10, but the paths in the traceback show the venv is running 3.9.13, and the `str | None` union syntax (PEP 604) requires 3.10+
Traceback (most recent call last):
File "/Users/arun/Documents/DDP_backend/manage.py", line 22, in <module>
main()
File "/Users/arun/Documents/DDP_backend/manage.py", line 18, in main
execute_from_command_line(sys.argv)
File "/Users/arun/Documents/DDP_backend/venv/lib/python3.9/site-packages/django/core/management/__init__.py", line 446, in execute_from_command_line
utility.execute()
File "/Users/arun/Documents/DDP_backend/venv/lib/python3.9/site-packages/django/core/management/__init__.py", line 440, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "/Users/arun/Documents/DDP_backend/venv/lib/python3.9/site-packages/django/core/management/base.py", line 402, in run_from_argv
self.execute(*args, **cmd_options)
File "/Users/arun/Documents/DDP_backend/venv/lib/python3.9/site-packages/django/core/management/base.py", line 448, in execute
output = self.handle(*args, **options)
File "/Users/arun/Documents/DDP_backend/venv/lib/python3.9/site-packages/django/core/management/base.py", line 96, in wrapped
res = handle_func(*args, **kwargs)
File "/Users/arun/Documents/DDP_backend/venv/lib/python3.9/site-packages/django/core/management/commands/migrate.py", line 97, in handle
self.check(databases=[database])
File "/Users/arun/Documents/DDP_backend/venv/lib/python3.9/site-packages/django/core/management/base.py", line 475, in check
all_issues = checks.run_checks(
File "/Users/arun/Documents/DDP_backend/venv/lib/python3.9/site-packages/django/core/checks/registry.py", line 88, in run_checks
new_errors = check(app_configs=app_configs, databases=databases)
File "/Users/arun/Documents/DDP_backend/venv/lib/python3.9/site-packages/django/core/checks/urls.py", line 14, in check_url_config
return check_resolver(resolver)
File "/Users/arun/Documents/DDP_backend/venv/lib/python3.9/site-packages/django/core/checks/urls.py", line 24, in check_resolver
return check_method()
File "/Users/arun/Documents/DDP_backend/venv/lib/python3.9/site-packages/django/urls/resolvers.py", line 494, in check
for pattern in self.url_patterns:
File "/Users/arun/Documents/DDP_backend/venv/lib/python3.9/site-packages/django/utils/functional.py", line 57, in __get__
res = instance.__dict__[self.name] = self.func(instance)
File "/Users/arun/Documents/DDP_backend/venv/lib/python3.9/site-packages/django/urls/resolvers.py", line 715, in url_patterns
patterns = getattr(self.urlconf_module, "urlpatterns", self.urlconf_module)
File "/Users/arun/Documents/DDP_backend/venv/lib/python3.9/site-packages/django/utils/functional.py", line 57, in __get__
res = instance.__dict__[self.name] = self.func(instance)
File "/Users/arun/Documents/DDP_backend/venv/lib/python3.9/site-packages/django/urls/resolvers.py", line 708, in urlconf_module
return import_module(self.urlconf_name)
File "/Users/arun/.pyenv/versions/3.9.13/lib/python3.9/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 850, in exec_module
File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
File "/Users/arun/Documents/DDP_backend/ddpui/urls.py", line 5, in <module>
from ddpui.api.client.airbyte_api import airbyteapi
File "/Users/arun/Documents/DDP_backend/ddpui/api/client/airbyte_api.py", line 21, in <module>
from ddpui.ddpprefect.prefect_service import run_airbyte_connection_sync
File "/Users/arun/Documents/DDP_backend/ddpui/ddpprefect/prefect_service.py", line 19, in <module>
def get_airbyte_server_block_id(blockname) -> str | None:
TypeError: unsupported operand type(s) for |: 'type' and 'NoneType'
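One way to keep the code running on 3.9 as well is typing.Optional instead of the PEP 604 union:

    # ddpui/ddpprefect/prefect_service.py
    from typing import Optional

    def get_airbyte_server_block_id(blockname) -> Optional[str]:
        ...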
The PrefectService offers creation of four types of prefect blocks
Implement read / update / delete for these block types in PrefectService
Implement corresponding API endpoints in the ClientController
Make sure that a user can only touch their own organization's blocks! We track this using the OrgPrefectBlock table in our own DB
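A minimal sketch of that guard, assuming illustrative field names on OrgPrefectBlock:

    from ninja.errors import HttpError
    from ddpui.models.org import OrgPrefectBlock  # assumed import path

    def assert_block_belongs_to_org(org, block_id: str):
        # OrgPrefectBlock maps our orgs to the Prefect blocks they own
        if not OrgPrefectBlock.objects.filter(org=org, block_id=block_id).exists():
            raise HttpError(403, "block does not belong to this organization")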
The AirbyteService offers creation and retrieval of sources, destinations and connections.
Implement Delete and Update
Create API endpoints to create Prefect jobs to
What we're able to do: we can add the source for SurveyCTO and also add the destination. But sync is not working. This is the error we get:
Traceback (most recent call last):
File "/home/ddp/DDP_backend/venv/lib/python3.10/site-packages/ninja/operation.py", line 104, in run
result = self.view_func(request, **values)
File "/home/ddp/DDP_backend/ddpui/api/client/airbyte_api.py", line 449, in post_airbyte_connection
airbyte_conn = airbyte_service.create_connection(org.airbyte_workspace_id, payload)
File "/home/ddp/DDP_backend/ddpui/ddpairbyte/airbyte_service.py", line 258, in create_connection
sourceschemacatalog = get_source_schema_catalog(
File "/home/ddp/DDP_backend/ddpui/ddpairbyte/airbyte_service.py", line 154, in get_source_schema_catalog
raise Exception("Failed to get source schema catalogs")
Exception: Failed to get source schema catalogs
In create_dbtcore_block, define the dbt_cli_profile for a BigQuery target
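A sketch, assuming prefect-dbt and prefect-gcp; the credential wiring and names are illustrative:

    from prefect_dbt.cli import DbtCliProfile
    from prefect_dbt.cli.configs import BigQueryTargetConfigs
    from prefect_gcp.credentials import GcpCredentials

    def make_bigquery_cli_profile(profile_name, target_name, dataset, service_account_info):
        target_configs = BigQueryTargetConfigs(
            schema=dataset,  # for BigQuery, "schema" means the dataset
            credentials=GcpCredentials(service_account_info=service_account_info),
        )
        return DbtCliProfile(
            name=profile_name, target=target_name, target_configs=target_configs
        )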
When defining an Airbyte Connection, allow the override of the destination namespace ("schema" for Postgres, "dataset" for BigQuery)
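A sketch of the relevant connection-create fields; Airbyte's API takes namespaceDefinition="customformat" plus a namespaceFormat string for the override:

    def connection_payload(source_id, destination_id, sync_catalog, target_schema):
        return {
            "sourceId": source_id,
            "destinationId": destination_id,
            # "customformat" overrides the namespace: schema on Postgres, dataset on BigQuery
            "namespaceDefinition": "customformat",
            "namespaceFormat": target_schema,
            "syncCatalog": sync_catalog,
        }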