Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Database schema without foreign keys may introduce inconsistent data #70

Open
Piiit opened this issue Jan 19, 2021 · 8 comments
Open

Database schema without foreign keys may introduce inconsistent data #70

Piiit opened this issue Jan 19, 2021 · 8 comments
Assignees
Labels
bug Something isn't working data quality prj:beacon
Milestone

Comments

@Piiit
Copy link
Contributor

Piiit commented Jan 19, 2021

At the moment we have some orphans inside "order_data", which do not have a beacon assigned inside "beacon_data" (production environment). And on the testing environment we have test beacons that do not have any order_data assigned. This happened because there are no constraints of the database tables with foreign keys. We should add them to avoid such problems and to understand how tables are connected.

@Piiit Piiit added bug Something isn't working data quality labels Jan 19, 2021
@Piiit
Copy link
Contributor Author

Piiit commented Jan 19, 2021

@davidebz @noctho I tried the following to find all orphans:

Query to find all order_data records which do not have a beacon assigned:

select * from order_data 
where info_id not in (
	select b.id
	from order_data od 
	join beacon_data b on b.id = od.info_id
)

@Piiit
Copy link
Contributor Author

Piiit commented Jan 19, 2021

@sseppi FYI

@noctho
Copy link
Collaborator

noctho commented Jan 20, 2021

I tried on a old dump of the databese the following query:

select * from info 
where id not in (
	select id from beacon_data
)

There I received exactly the same result (same ids) and additionally the records of:

select * from info 
where id not in (
	select id from order_data
)
order by id;

The resulting info record have the following values as name:

  • "JEN0011#ZIDdHG"
  • "SUD0031"
  • "SUD0029"
  • "Frei- DEFEKT!"
  • "SUD0029"
  • "KIE0008#KqBw87 musste gegen LAU0010#AW7kKr ausgetauscht werden"
  • "SUD0031"
  • "SUD0030"
  • "SUD0030"

I investigated a bit and found out, that before we eliminated the google spreadsheet, every time a new line was found in the sheet (line with empty id) an info record was created with an id that was generated randomly. Random iBeacon values and EddystoneBeacon values were generated too.

After the creation of the info record, an order_data record was also created, which contains the id of info as id and info_id as well as the position in the sheet. But there were no checks that a beacon_data with this ID also existed. I have not found any other use of the order_data, except for the OrderController. But I haven't found out what it is used for.

Is it possible that it was used to fetch all beacons from the system that are planned (or requested) but not yet ordered? To make a bulk order?

Based on this analysis, it is very plausible that order_data records existed without a corresponding beacon_data record and also info without beacon_data. After the two tables were fused, these info records were lost. If I now execute the following query, I get exactly the record from the list above.

select *
from info
where id not in (select id from beacon_data)

Anyway, since the last recent changes, the order_data is no longer used, so I would suggest dropping the table. I will also analyse the database to see if any foreign keys are missing and create them.

@Piiit
Copy link
Contributor Author

Piiit commented Jan 21, 2021

@sseppi and I had a discussion, and we found out that this table has no longer interesting information for us. We should replace it with the following columns: "order_id" (<-- PK) and "created_at", and then put a foreign key "order_id" from the table beacon_data to this order_id...

Since we eliminated the spreadsheet import, we should also remove the remainders inside the db...

@davidebz @noctho What do you think?

@noctho
Copy link
Collaborator

noctho commented Jan 21, 2021

What should contain the replaced order_data columns? Should the order_id be the order_id of kontakt.io? In this case the content of the table would be completely changed.

At the moment, as I analysed it, order_symbol is the "ID" of an order, but this is just an internally generated identifier. "created_at" of the current table is the time stamp when the record was created.

So I think if you want to memorize the orders from kontakt.io, then it is a new functionality and we have to recover the data of the existing beacons.

@Piiit
Copy link
Contributor Author

Piiit commented Jan 21, 2021

Yes, that would be a complete new approach, that is, a reuse of that table. If we have time to do it, would be nice... otherwise, if it is too late before our meeting, we could move this into a future development cycle.

@davidebz
Copy link
Collaborator

davidebz commented Jan 22, 2021

Hi Peter, I agree it is a nice activity, i like foreign key and database constraints in general, but I think is too far from the list of activity we have to do in this development cycle and moreover too short time to make an estimations of the required time.

@Piiit
Copy link
Contributor Author

Piiit commented Jan 22, 2021

@davidebz That is good... I move it to a "future" milestone then... eventually I will do it, if time allows

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working data quality prj:beacon
Projects
None yet
Development

No branches or pull requests

3 participants