Save and Update Nodes to API

nh916 edited this page Aug 9, 2023 · 8 revisions

Save

Construct Giant JSON

We create a giant JSON from our Project node to send all the nodes to the API in one request. This is an important step because it guards against partially saved, inconsistent data.

SDK History

In the previous SDK, nodes had to be saved one at a time. This caused problems: if the user's script hit an error halfway through, half of the project would already be uploaded to the database and the other half would not. It is a bigger problem for updates: if the script has a bug halfway through the upload, half of the nodes are updated and half are outdated, leaving bad data in the database.

With the new SDK, we wanted saving to be all or nothing: either every node is saved to the API or, if the API hits an error while parsing the request, it rejects the entire request, saves nothing, and returns an error to be corrected. This way the database always holds valid data.

Giant JSON with UUID

We construct a giant JSON of a Project node and use UUIDs to point to nodes that already appear within the JSON. Referring to a node by its UUID, instead of embedding the full node everywhere it is used, keeps the JSON small.

Repeated nodes can increase the JSON size very quickly. In the beginning, the JSON was 8M+ lines long.
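
To make the size argument concrete, here is a minimal sketch of condensing repeated nodes. It is not the SDK's implementation (the SDK decides what to condense per attribute, see `condense_to_uuid` in `core.py`). The function name `condense_repeats`, the first-occurrence rule, and the short placeholder UUIDs such as `m-1` are assumptions made for illustration.

```python
from typing import Any, Set


def condense_repeats(node: Any, seen: Set[str]) -> Any:
    """Return a copy of `node` where repeated nodes become {"uuid": ...}."""
    if isinstance(node, list):
        return [condense_repeats(item, seen) for item in node]
    if not isinstance(node, dict):
        return node
    uuid = node.get("uuid")
    if uuid is not None:
        if uuid in seen:
            # Already written out in full elsewhere: keep only the reference.
            return {"uuid": uuid}
        seen.add(uuid)
    return {key: condense_repeats(value, seen) for key, value in node.items()}


# Hypothetical project: the same material appears twice in the graph.
material = {"node": ["Material"], "uuid": "m-1", "name": "styrene"}
project = {
    "node": ["Project"],
    "uuid": "p-1",
    "material": [material],
    "inventory": [{"node": ["Inventory"], "uuid": "i-1", "material": [material]}],
}
condensed = condense_repeats(project, set())
```

In `condensed`, the material is written in full once under `material`, and the inventory only carries a UUID reference to it.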

Giant JSON example

The database schema enforces the use of UUIDs and guards against repeated nodes: before the API does any processing on the JSON, it checks the JSON against the DB schema and, if validation fails, returns the error as the response. As a result, the SDK has to send some nodes as full nodes and others as UUID references. For example, inventory only accepts UUID references when the Project JSON is sent, and it expects the materials to already be defined on the project.

The API also expects every referenced UUID to point to a node that has already been saved within the API.

I believe UID is supposed to be used as a reference when the node is new.

The SDK does not know which nodes are already in the DB and which are new unless it traverses the whole graph and checks every single node one by one.

SDK Save Solution

The SDK handles this by getting feedback from the API: it learns from the errors, corrects them, and saves the nodes.

  1. Construct the giant JSON and try sending a POST request to the API
    1. If the API responds with HTTP 200, then everything is fine and we can move on ✅
    2. If the API responds with HTTP 400 {Bad UUID Error}, then a node has been condensed to a UUID that does not yet exist on the API

      Save the condensed UUID node to the API, then try sending the giant JSON again

Code for Condensing node to UUID

    # core.py
    def get_json(
        self,
        handled_ids: Optional[Set[str]] = None,
        known_uuid: Optional[Set[str]] = None,
        suppress_attributes: Optional[Dict[str, Set[str]]] = None,
        is_patch=False,
        condense_to_uuid={
            "Material": ["parent_material", "component"],
            "Inventory": ["material"],
            "Ingredient": ["material"],
            "Property": ["component"],
            "ComputationProcess": ["material"],
            "Data": ["material"],
            "Process": ["product", "waste"],
            "Project": ["member", "admin"],
            "Collection": ["member", "admin"],
        },
        **kwargs
    ):
        ...  # body omitted in this excerpt

Bad UUID API Error Solution Steps

  1. Find the Bad UUID node by its UUID in the Project tree
  2. Save the full node to the API
  3. Remove all occurrences of the full node from the Project tree and refer to the saved node only by its UUID, because it has already been saved to the API
  4. Repeat this process until there are no Bad UUID API errors and everything has been saved
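
The four steps above can be sketched as a retry loop. This is a rough illustration under stated assumptions, not the SDK's code: `FakeAPI` and `BadUUIDError` are hypothetical in-memory stand-ins for the real API and its HTTP 400 Bad UUID response, and the helper names and placeholder UUIDs are invented for the example.

```python
from typing import Any, Optional, Set


class BadUUIDError(Exception):
    """Hypothetical stand-in for the API's HTTP 400 Bad UUID response."""

    def __init__(self, uuid: str) -> None:
        super().__init__(f"Bad UUID: {uuid}")
        self.uuid = uuid


def is_reference(node: Any) -> bool:
    """True for a node condensed to a bare {"uuid": ...} reference."""
    return isinstance(node, dict) and set(node) == {"uuid"}


def walk(node: Any):
    """Yield every dict in a nested JSON structure, parents first."""
    if isinstance(node, dict):
        yield node
        for value in node.values():
            yield from walk(value)
    elif isinstance(node, list):
        for item in node:
            yield from walk(item)


class FakeAPI:
    """In-memory stand-in: a POST is saved entirely or rejected entirely."""

    def __init__(self) -> None:
        self.saved: Set[str] = set()

    def post(self, payload: dict) -> None:
        nodes = list(walk(payload))
        for node in nodes:
            if is_reference(node) and node["uuid"] not in self.saved:
                raise BadUUIDError(node["uuid"])  # nothing has been saved yet
        self.saved |= {n["uuid"] for n in nodes if "uuid" in n}


def find_full_node(tree: Any, uuid: str) -> Optional[dict]:
    """Step 1: locate the full (non-condensed) node with this UUID."""
    for node in walk(tree):
        if node.get("uuid") == uuid and not is_reference(node):
            return node
    return None


def condense(tree: Any, uuid: str) -> Any:
    """Step 3: replace every occurrence of the node with a UUID reference."""
    if isinstance(tree, list):
        return [condense(item, uuid) for item in tree]
    if isinstance(tree, dict):
        if tree.get("uuid") == uuid:
            return {"uuid": uuid}
        return {key: condense(value, uuid) for key, value in tree.items()}
    return tree


def save(api: FakeAPI, tree: dict, max_attempts: int = 50) -> dict:
    """Steps 1-4: retry the POST until no Bad UUID error remains."""
    for _ in range(max_attempts):
        try:
            api.post(tree)
            return tree
        except BadUUIDError as err:
            full = find_full_node(tree, err.uuid)
            if full is None or full is tree:
                raise  # the node is not in our tree, so we cannot fix it
            save(api, full, max_attempts)  # step 2: save the full node first
            tree = condense(tree, err.uuid)
    raise RuntimeError("still failing after max_attempts")


project = {
    "node": ["Project"],
    "uuid": "p-1",
    "material": [{"node": ["Material"], "uuid": "m-1", "name": "styrene"}],
    "inventory": [
        {"node": ["Inventory"], "uuid": "i-1", "material": [{"uuid": "m-1"}]}
    ],
}
api = FakeAPI()
sent = save(api, project)
```

The first POST is rejected because the inventory refers to a material the fake API has not saved yet. The loop then saves the material on its own, condenses it to a reference everywhere in the project, and the second POST goes through.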


Update

Computation

Input ✅

can update computation.notes

{
   "node":["Computation"],
   "updated_at":"2023-07-12T02:44:13.942095Z",
   "model_version":"1.0.0",
   "name":"my computation name",
   "type":"data_fit",
   "notes": "computation notes UPDATED"
}

Input ✅

can update computation.notes with software

{
    "node": ["Computation"],
    "updated_by": {
        "node": ["User"],
        "uid": "_:0x15f901"
    },
    "updated_at": "2023-07-12T02:57:39.943587Z",
    "model_version": "1.0.0",
    "name": "my computation name",
    "type": "analysis",
    "notes": "computation > software_configuration > software UPDATED",
    "software_configuration": [
        {"uuid": "37c2223a-164a-4ef7-b01f-4f6876fdf877"}
    ]
}

Input ✅

can update computation.notes when the node has a software_config with software

{
    "node": [
        "Computation"
    ],
    "updated_by": {
        "node": ["User"],
        "uid": "_:0x15f901"
    },
    "updated_at": "2023-07-12T02:57:39.943587Z",
    "model_version": "1.0.0",
    "name": "my computation name",
    "type": "analysis",
    "notes": "these here are UPDATED NOTES BABY AGAINNNNN!!!!",
    "software_configuration": [
        {
            "uuid": "37c2223a-164a-4ef7-b01f-4f6876fdf877"
        }
    ]
}

Steps

  1. Isolate the node that you want to update and remove all of the outer layers

    Example: if you want to update an experiment,
    isolate it and remove the outer layers of collection and project

  2. Strip 'created_at', 'created_by', 'uid', 'uuid' from the JSON

    1. Otherwise, the API will respond with: Additional properties are not allowed ('created_at', 'created_by', 'uid', 'uuid' were unexpected) at path: /
  3. Send the request to get feedback on your update

  4. If you get a Duplicate uuid: 5aa3f648-f27b-4478-a81c-fd64965e87bb provided error

    I think this fix also solves some other errors; it seems to solve most of them

    1. Find the node and reduce it to just its UUID

      1. From
      "software":{
         "node":["Software"],
         "uid":"_:0x196205",
         "uuid":"5aa3f648-f27b-4478-a81c-fd64965e87bb"
      }

      To

      "software": {
      	"uuid": "5aa3f648-f27b-4478-a81c-fd64965e87bb"
      }
    2. If an error like this pops up, saying that the node is not valid

      is not valid under any of the given schemas
      at path: properties/software_configuration/items
      
      {
        "uuid": "37c2223a-164a-4ef7-b01f-4f6876fdf877",
        "software": {
          "uuid": "5aa3f648-f27b-4478-a81c-fd64965e87bb"
        },
        "model_version": "1.0.0",
        "updated_at": "now()",
        "created_at": "now()"
      }
      1. Then remove that node from the JSON and save it separately
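
Step 4 can be sketched in code as well: read the UUID out of the error message, then reduce the matching node to a bare reference. The regular expression assumes the message looks exactly like the Duplicate uuid text quoted in step 4. The function names are made up for this example and are not part of the SDK.

```python
import re
from typing import Any, Optional

# Message format copied from the Duplicate uuid error quoted in step 4.
DUPLICATE_UUID = re.compile(r"Duplicate uuid: ([0-9a-fA-F-]{36}) provided")


def duplicate_uuid(message: str) -> Optional[str]:
    """Return the UUID named in a Duplicate uuid error, or None."""
    match = DUPLICATE_UUID.search(message)
    return match.group(1) if match else None


def reduce_to_uuid(node: Any, uuid: str) -> Any:
    """Return a copy of `node` with the matching node reduced to {"uuid": ...}."""
    if isinstance(node, list):
        return [reduce_to_uuid(item, uuid) for item in node]
    if isinstance(node, dict):
        if node.get("uuid") == uuid:
            return {"uuid": uuid}
        return {key: reduce_to_uuid(value, uuid) for key, value in node.items()}
    return node


payload = {
    "software": {
        "node": ["Software"],
        "uid": "_:0x196205",
        "uuid": "5aa3f648-f27b-4478-a81c-fd64965e87bb",
    }
}
error = "Duplicate uuid: 5aa3f648-f27b-4478-a81c-fd64965e87bb provided"
bad = duplicate_uuid(error)
fixed = reduce_to_uuid(payload, bad) if bad else payload
```

After this, `fixed` holds the "To" form shown in step 4.1, and the request can be sent again.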

Quick Notes

see if node exists
if yes
	send patch
		if error
			find the node within the graph and send that to the API
				try to post again
					if error repeat the process
if not found
	post project
		if bad uuid
			find the uuid that is bad from the graph
				post that
					get a 200 response
						post the project again
							if error, then repeat the process

Fields that must be removed for PATCH

REMOVE_ATTRIBUTES = [
    "uid",
    "uuid",
    "public",
    "locked",
    "model_version",
    "created_at",
    "updated_at",
    "created_by",
    "updated_by",
]
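
A minimal sketch of applying this list before a PATCH. The helper name `strip_for_patch` is hypothetical. It strips only the top level of the node, which is an assumption based on the "Additional properties are not allowed" error being reported at path `/` and on the accepted inputs above keeping nested `uuid` references.

```python
from typing import Any, Dict

REMOVE_ATTRIBUTES = [
    "uid",
    "uuid",
    "public",
    "locked",
    "model_version",
    "created_at",
    "updated_at",
    "created_by",
    "updated_by",
]


def strip_for_patch(node: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `node` without the top-level fields listed above."""
    return {key: value for key, value in node.items() if key not in REMOVE_ATTRIBUTES}


# Hypothetical node as it might come back from the API.
computation = {
    "node": ["Computation"],
    "uid": "_:0x15f901",
    "uuid": "00000000-0000-0000-0000-000000000000",
    "created_at": "2023-07-12T02:44:13.942095Z",
    "model_version": "1.0.0",
    "name": "my computation name",
    "notes": "computation notes UPDATED",
    "software_configuration": [{"uuid": "37c2223a-164a-4ef7-b01f-4f6876fdf877"}],
}
patch_payload = strip_for_patch(computation)
```

The nested `software_configuration` reference keeps its `uuid`, since that is how the payload points to an already saved node.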

Recommendation

  • When working on integration, write out the HTTP request payload by hand
  • Use Postman to send HTTP requests to the API and inspect the responses
  • Design the step-by-step logic on paper, then attempt to code it