all: fix grammar, typos, reformat tables #586
Changes from 1 commit
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
# Deployment | ||
|
||
This guide is meant to give an high-level overview of deployment techniques and tips | ||
This guide is meant to give a high-level overview of deployment techniques and tips | ||
Review comment: 👍 hard "h" for "high" |
||
when planning how to deploy InvenioRDM. | ||
|
||
!!! info "Read the infrastructure architecture first!" | ||
|
@@ -67,15 +67,15 @@ scope of this documentation. | |
## Services | ||
|
||
When deploying InvenioRDM, you can choose to install, securely configure and | ||
maintain yourself the services (such as the database, the search engine, etc..) | ||
maintain yourself the services (such as the database, the search engine, etc.) | ||
services, or use third party providers. For example, the three cloud providers | ||
mentioned above (AWS, GC, and Azure) can provide most of the services. If you | ||
choose to deploy them yourself you will need to deep-dive and get experienced | ||
in the following topics: | ||
|
||
- Persist your data, and enable periodic backups. This includes the relational | ||
database, the search indices and the files. | ||
- Queue persistence (RabbitMQ) to avoid loosing tasks in case the service fails. | ||
- Queue persistence (RabbitMQ) to avoid losing tasks in case the service fails. | ||
- High availability, many of the services can be deployed redundantly. Note | ||
that most of the cloud providers offer this option or have an established SLA. | ||
- Secrets handling. Most of the services require credentials to connect. It is | ||
|
@@ -107,7 +107,7 @@ means that your application code will change and versioning will help you to | |
control which code is actually deployed. At CERN, we use GitHub tags/release | ||
for each version (e.g. v1.0.0). | ||
|
||
If you are using container for the deployment, you can automate the image | ||
If you are using containers for the deployment, you can automate the image | ||
build with for example [GitHub actions](https://github.com/features/actions) | ||
(or any other CI tool). In addition, some PaaS platforms have the capabilities | ||
to detect when a new image for a certain tag (e.g. `production`) has changed | ||
|
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -23,11 +23,11 @@ projects, prior history and practices. | |||||
|
||||||
**Evolving** | ||||||
|
||||||
InvenioRDM is no different. The architecture is largely a by product our past | ||||||
InvenioRDM is no different. The architecture is largely a byproduct our past | ||||||
Review comment (suggested change): and remove "we've faced" in next line as it's redundant. |
||||||
experiences and challenges we've faced. The architecture as described here, is | ||||||
not meant to be final answer, but rather an evolving architecture that adapts | ||||||
and improve over time. You also won't find the answer to all your question. As | ||||||
we work with the architecture, we identify short comings, missing things and concepts | ||||||
we work with the architecture, we identify shortcomings, missing things and concepts | ||||||
that could be better defined. | ||||||
|
||||||
**Past experiences and challenges** | ||||||
|
@@ -43,7 +43,7 @@ By no means have we solved all of these, and any software project out there is l | |||||
|
||||||
### Why not X? | ||||||
|
||||||
InvenioRDM is a monolith application using something as old as an relational | ||||||
InvenioRDM is a monolith application using something as old as a relational | ||||||
Review comment (suggested change): Grammar-wise this is probably the most correct. Tech discussions around this usually talk about the nouns directly ("monoliths"! "micro-services"!) rather than "x application". It does sound a little bit "harsher" because the reader spends more time on a bigger word that describes a bigger thing, so the effect is compounded. But that's just how it sounds. 🤷 |
||||||
database system, thus we sometimes get asked questions like why not use microservices, why not serverless | ||||||
and why not use NoSQL, so here's an attempt to give some vague answers. | ||||||
|
||||||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -90,7 +90,7 @@ are usually highly reliable as compared to some NoSQL solutions. | |
|
||
**Primary key lookups** | ||
|
||
Most access from Invenio to the database is via primary key look ups, which | ||
Most access from Invenio to the database is via primary key look-ups, which | ||
Review comment: "lookups" is good too. I want to say "lookups" is more common in computer science lingo, but unsure. So this one is a take it or leave it. |
||
are usually very efficient in database. Search queries and the like are all | ||
sent to the search engine cluster which can provide much better performance | ||
than a database. | ||
|
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -22,9 +22,9 @@ The record data model is used to describe **a resource**. Examples of resources | |||||
include e.g. journal articles, datasets, posters, videos, images, software and | ||||||
more. Some properties of resources: | ||||||
|
||||||
- A resource may exists in one or more versions. | ||||||
- A resource may exist in one or more versions. | ||||||
- A resource version has its own persistent identifiers and bibliographic | ||||||
metadata (e.g. title, publication date, creator list etc may be different | ||||||
metadata (e.g. title, publication date, creator list etc. may be different | ||||||
|
||||||
between versions). | ||||||
- All versions of a resource should be accessible and editable. | ||||||
|
||||||
|
@@ -79,8 +79,8 @@ and save partial changes prior to being published. | |||||
|
||||||
There's three different types of drafts: | ||||||
|
||||||
- **New draft**: The initial draft when no record versions has been published. | ||||||
- **Edit draft**: A draft of an published record version (used when editing an | ||||||
- **New draft**: The initial draft when no record versions have been published. | ||||||
- **Edit draft**: A draft of a published record version (used when editing an | ||||||
already published record) | ||||||
- **Next draft**: A draft without a published record version that is not the | ||||||
initial draft. | ||||||
|
@@ -113,14 +113,14 @@ unordered child nodes. This means there is **no order** on record versions. | |||||
|
||||||
One record version is designated as the latest version tracked by this | ||||||
state so that it's possible to show only the latest version of a record. It | ||||||
also keep tracks of which is the latest draft/next draft if they exists. | ||||||
also keeps track of which is the latest draft/next draft if they exist. | ||||||
|
||||||
Lastly, an integer index is incremented every time a new record version is | ||||||
created. | ||||||
|
||||||
The reason that we do not define an order on record versions is because | ||||||
multiple different orders may be relevant. A version history may no | ||||||
necessarily be linear so it can make sense to order by version number, | ||||||
multiple different orders may be relevant. A version history may not | ||||||
necessarily be linear, so it can make sense to order by version number, | ||||||
publication date or order in which the record version was created. | ||||||
|
||||||
## Files | ||||||
|
@@ -208,7 +208,7 @@ allowed to be used as | |||||
The request itself follows the documented request states and transitions as | ||||||
documented under [requests](requests.md#statuses). | ||||||
|
||||||
The draft goes through it's own states as shown below: | ||||||
The draft goes through its own states as shown below: | ||||||
|
||||||
![Draft review state diagram](../img/review-states.svg) | ||||||
|
||||||
|
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -166,7 +166,7 @@ build grant tokens both from: | |||||
- a required need: e.g. a request requiring needs to grant access. | ||||||
- a provided need: e.g. an identity providing needs. | ||||||
|
||||||
This provides an for efficient searches when you serialize the required grants | ||||||
This provides for efficient searches when you serialize the required grants | ||||||
Review comment (suggested change): might as well make the tone less flowery and more direct 😸 |
||||||
into the indexed request, and when searching, you filter requests to the grants | ||||||
provided by the identity. | ||||||
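Note on the context lines above (not part of this diff): the grant-token idea can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, including the `Need` tuple, the `method:value` token format and the in-memory list standing in for the search index; none of it is taken from the InvenioRDM code base.

```python
from typing import NamedTuple


class Need(NamedTuple):
    """Hypothetical need: a method plus a value, e.g. ("role", "curator")."""
    method: str
    value: str


def grant_token(need: Need) -> str:
    # Hypothetical token format; the real serialization may differ.
    return f"{need.method}:{need.value}"


def index_request(request_id: str, required_needs: list) -> dict:
    # Serialize the required grants into the indexed request document.
    return {
        "id": request_id,
        "grants": sorted(grant_token(n) for n in required_needs),
    }


def search(index: list, provided_needs: list) -> list:
    # Keep only requests whose grants overlap the grants the identity provides.
    provided = {grant_token(n) for n in provided_needs}
    return [doc["id"] for doc in index if provided.intersection(doc["grants"])]


# A list of dicts stands in for the search index in this sketch.
index = [
    index_request("req-1", [Need("id", "1")]),
    index_request("req-2", [Need("role", "curator")]),
    index_request("req-3", [Need("id", "2"), Need("role", "curator")]),
]

identity_needs = [Need("id", "2"), Need("role", "admin")]
visible = search(index, identity_needs)
```

Because both sides are reduced to the same token strings, the filter is a plain set overlap, which is the kind of query a search engine can answer with a terms filter.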
|
||||||
|
@@ -194,7 +194,7 @@ We imagine that requests can later be extended with features such as: | |||||
|
||||||
A lot of the inspiration to the requests module comes from collaborative source code platforms like GitHub and GitLab and their pull/merge requests features. Other inspiration comes from user support systems like UserVoice. | ||||||
|
||||||
There are however notable differences between between pull/merge requests and requests in InvenioRDM. | ||||||
There are however notable differences between pull/merge requests and requests in InvenioRDM. | ||||||
|
||||||
Pull/merge requests is always made against a specific repository, and thus naturally *belong* to a given repository. This means that they are accessible on a single URL endpoint and permissions are conceptually somewhat easy to understand. | ||||||
|
||||||
Requests in InvenioRDM however makes sense from multiple endpoints depending on who is looking and the context they are doing it in. |
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -10,7 +10,7 @@ The guide provides a high-level overview of the core software architecture of In | |||||
|
||||||
## Layers | ||||||
|
||||||
InvenioRDM has a layered architecture that consistent of three layers: | ||||||
InvenioRDM has a layered architecture that consists of three layers: | ||||||
|
||||||
- Presentation layer | ||||||
- Service layer | ||||||
|
@@ -24,13 +24,13 @@ The diagram below shows a simplified view of the data flow in the architecture. | |||||
|
||||||
![Architecture layers](../img/architecture.svg) | ||||||
|
||||||
*The presentation layer* parses incoming requests and routes them to service layer. This involves sending and receiving data in multiple different formats and translating these into an internal representation, as well as e.g. parsing arguments from an HTTP request (e.g parsing the query string parameters). | ||||||
*The presentation layer* parses incoming requests and routes them to service layer. This involves sending and receiving data in multiple different formats and translating these into an internal representation, as well as e.g. parsing arguments from an HTTP request (e.g. parsing the query string parameters). | ||||||
Review comment (suggested change): Typically always have |
||||||
|
||||||
*The service layer* is completely independent from the presentation layer and can be used by many different presentation interfaces such as REST APIs, CLIs, Celery tasks. The service layer contains the overall control flow and is responsible for e.g. checking permissions and performing semantic data validation. | ||||||
|
||||||
*The data access layer* is responsible for ensuring data integrity, harmonizing data access to different storages as well as fetching and storing the data in the underlying systems. | ||||||
|
||||||
The data flow between the layers is strictly limited to some few well-defined objects to ensure a clean separation of concerns. The presentation layer communicates with the service layer via a e.g. a record projection (i.e. a view of a record localised to a specific identity). The service layer communicates with the data access layer via e.g. a record entity that provides data abstraction, syntactic data validation, and a strong programmatic API. | ||||||
The data flow between the layers is strictly limited to some few well-defined objects to ensure a clean separation of concerns. The presentation layer communicates with the service layer via e.g. a record projection (i.e. a view of a record localised to a specific identity). The service layer communicates with the data access layer via e.g. a record entity that provides data abstraction, syntactic data validation, and a strong programmatic API. | ||||||
Review comment (suggested change): This is a "Lars-ism" 😜 . Those are usually not an example but the actual thing itself. The words "projection" and "entity" are general enough to accommodate variations on that theme. |
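Note on the paragraphs in this hunk (not part of this diff): the three layers and the objects passed between them can be sketched roughly as follows. All class and function names (`RecordEntity`, `DataAccessLayer`, `RecordService`, `handle_get`) are hypothetical and chosen only for illustration; the actual InvenioRDM classes differ.

```python
import json
from dataclasses import dataclass
from urllib.parse import parse_qs


@dataclass
class RecordEntity:
    """Hypothetical record entity exchanged between service and data access."""
    id: str
    owner: str
    metadata: dict


class DataAccessLayer:
    """Stores entities and performs syntactic validation (a dict stands in for storage)."""

    def __init__(self):
        self._rows = {}

    def create(self, id_, owner, metadata):
        # Syntactic validation: the title must be a string.
        if not isinstance(metadata.get("title"), str):
            raise ValueError("metadata.title must be a string")
        entity = RecordEntity(id=id_, owner=owner, metadata=metadata)
        self._rows[id_] = entity
        return entity

    def get(self, id_):
        return self._rows[id_]


class RecordService:
    """Holds the control flow and permission checks; knows nothing about HTTP."""

    def __init__(self, data_access):
        self._data_access = data_access

    def read(self, identity, id_):
        entity = self._data_access.get(id_)
        # Record projection: a view of the record localised to this identity.
        return {
            "id": entity.id,
            "title": entity.metadata["title"],
            "can_edit": identity == entity.owner,
        }

    def update_title(self, identity, id_, title):
        entity = self._data_access.get(id_)
        if identity != entity.owner:
            raise PermissionError(f"{identity} may not edit {id_}")
        entity.metadata["title"] = title
        return self.read(identity, id_)


def handle_get(service, query_string):
    """Presentation layer: parse the query string, call the service, serialize."""
    params = parse_qs(query_string)
    projection = service.read(identity=params["user"][0], id_=params["id"][0])
    return json.dumps(projection, sort_keys=True)


data_access = DataAccessLayer()
data_access.create("rec-1", owner="alice", metadata={"title": "Example dataset"})
service = RecordService(data_access)
response_body = handle_get(service, "id=rec-1&user=bob")
```

The same `RecordService` could be driven from a CLI or a background task without changes, since only `handle_get` knows about query strings and JSON.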
||||||
|
||||||
!!! tip "Tip: Where do you belong?" | ||||||
|
||||||
Review comment: Just below is another fix to do: "where you code belongs" -> "where your code belongs" |
||||||
|
@@ -65,12 +65,12 @@ The data access layer serves two purposes: | |||||
|
||||||
- Provide a strong programmatic API that produce a clean, simple and reliable | ||||||
control flow in the service layer. | ||||||
- Persist our business objects on data storage in an reliable and performant | ||||||
- Persist our business objects on data storage in a reliable and performant | ||||||
way. | ||||||
|
||||||
!!! tip "Tip: Messy service layer?" | ||||||
|
||||||
If you service layer code looks messy, likely you need to work on your data | ||||||
If you service layer code looks messy, you may need to work on your data | ||||||
|
||||||
access layer. | ||||||
|
||||||
A typical example is the service layer doing data-wrangling with | ||||||
|
@@ -137,7 +137,7 @@ The search mappings define how records are indexed and made searchable. Records | |||||
|
||||||
Dumpers are responsible for dumping and loading prior to storing/fetching records on secondary storage (e.g. the search index), and play a key role for harmonizing data access to records from primary and secondary storages. | ||||||
|
||||||
Dumpers are specific to a secondary storage system (e.g. an search dumper, a file dumper, ...). | ||||||
Dumpers are specific to a secondary storage system (e.g. a search dumper, a file dumper, ...). | ||||||
|
||||||
|
||||||
The dump and load of a dumper MUST be idempotent - i.e. ``record == Record.load(record.dump())``. This ensures that independently of if a record was retrieved from primary or secondary storage, it has the same data and works in the same manner. | ||||||
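Note on the context line above (not part of this diff): the round-trip property ``record == Record.load(record.dump())`` can be illustrated with a minimal sketch. The `Record` class below is a hypothetical stand-in that dumps to JSON; it is not InvenioRDM's record or dumper API.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Record:
    """Hypothetical stand-in for a record; equality compares the data dict."""
    data: dict = field(default_factory=dict)

    def dump(self) -> str:
        # Serialize into the form stored on secondary storage (JSON here).
        return json.dumps(self.data, sort_keys=True)

    @classmethod
    def load(cls, dumped: str) -> "Record":
        # Rebuild the record from its dumped form.
        return cls(data=json.loads(dumped))


record = Record({"title": "Example", "version": 1, "tags": ["a", "b"]})

# The property quoted above: a dump followed by a load gives back an equal record.
assert record == Record.load(record.dump())
```

A check like this only holds when every value survives the serialization format unchanged. For example, a `datetime` field would need explicit conversion in both `dump` and `load` for the round trip to stay lossless.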
|
||||||
|
@@ -341,7 +341,7 @@ The resource config are used for dependency injection to | |||||
|
||||||
## Performance considerations | ||||||
|
||||||
Performance is of very high importance for InvenioRDM. There's however often | ||||||
Performance is of very high importance for InvenioRDM. There are however often | ||||||
trade-offs to be made. | ||||||
|
||||||
**Query vs indexing speed** | ||||||
|
@@ -350,7 +350,7 @@ For InvenioRDM query speed is more important that fast indexing speeds. This mea | |||||
we'll sometimes denormalize data to have high enough query speed. Once we denormalize | ||||||
data we immediately must also deal with stale data and cache invalidation. | ||||||
|
||||||
The version counter on on all records is instrumental in being able to manage | ||||||
The version counter on all records is instrumental in being able to manage | ||||||
the speed. | ||||||
|
||||||
**Database vs search engine** | ||||||
|
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -25,7 +25,7 @@ git add --patch | |||||
#### Linear history | ||||||
|
||||||
Our branches follow a linear commit history, meaning that | ||||||
we use *rebasing* instead of e.g. *merge commits*. In an nutshell this | ||||||
we use *rebasing* instead of e.g. *merge commits*. In a nutshell this | ||||||
|
||||||
translates into: | ||||||
|
||||||
```console | ||||||
|
@@ -131,7 +131,7 @@ a third-party dependency must be careful evaluated if before being added. | |||||
|
||||||
Reviews are a very important part of the development process, but also has the potential to lead to conflicts among developers. | ||||||
|
||||||
Follow this guidelines to minimize the risk of conflicts: | ||||||
Follow these guidelines to minimize the risk of conflicts: | ||||||
|
||||||
#### Code of conduct | ||||||
|
||||||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -17,7 +17,7 @@ The guide covers general style guidelines. | |
- **Brand**: | ||
- The color used for theming your InvenioRDM. | ||
|
||
**Do's and don'ts:** | ||
**Dos and don'ts:** | ||
Review comment: Different styles support different writings, but |
||
|
||
- ✅ Do style components with logical class names (like ``<div class="ui brand segment">``). | ||
- ❌ Don't style components with color names (e.g. ``<div class="ui blue segment">``). | ||
|
Review comment: Hehe this is a tricky one 😺 , but "an" is probably the right one. "a" / "an" is actually based on the "sound" of consonant / vowel rather than on the actual consonants / vowels. And this is especially true for acronyms. So we write "an FBI file", because "eff-bee-eye", and we should write "an svg file" because "ess-vee-gee". See: https://www.merriam-webster.com/grammar/is-it-a-or-an