minor changes
ana-vranic committed Nov 20, 2023
1 parent d573375 commit 7ce2135
Showing 3 changed files with 17 additions and 31 deletions.
2 changes: 1 addition & 1 deletion docs/MapReduce.md
@@ -12,7 +12,7 @@ In the first example, we will implement word count.
The mapper is defined as follows:

```py title="map.py"
-!/usr/bin/env python3
+#!/usr/bin/env python3
import sys
import re

44 changes: 15 additions & 29 deletions docs/Spark.md
@@ -205,9 +205,12 @@ Then we initialize the PageRank od each player.
Nodes = (xs.keys() + xs.values()).distinct()
ranks = Nodes.map(lambda x: (x, 100))
sorted(ranks.collect())
---------------------------
-[('player1', 1.0), ('player2', 1.0), ('player3', 1.0), ('player4', 1.0)]

```
+
+```
+--------------------------
+[('player1', 100.0), ('player2', 100.0), ('player3', 100.0), ('player4', 100.0)]
+```
Here we join the lists of lost games and the PageRank values into tuples. Because links and ranks share the same keys, we can combine them by calling the ```join()``` function.
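To make the effect of the join concrete, here is a minimal pure-Python sketch of an inner join on key-value pairs. It only mimics the semantics of a pair-RDD `join()`; it is not the tutorial's Spark code. The helper `join_pairs` and the sample `links` and `ranks` lists are hypothetical and introduced purely for illustration.

```python
from collections import defaultdict


def join_pairs(left, right):
    """Inner-join two lists of (key, value) pairs on their keys.

    Hypothetical helper: it imitates what a pair-RDD join produces,
    namely (key, (left_value, right_value)) for every matching key.
    """
    right_by_key = defaultdict(list)
    for key, value in right:
        right_by_key[key].append(value)
    return [
        (key, (left_value, right_value))
        for key, left_value in left
        for right_value in right_by_key.get(key, [])
    ]


# Hypothetical sample data: who each player lost to, and the initial ranks.
links = [("player1", ["player2", "player3"]), ("player2", ["player1"])]
ranks = [("player1", 100.0), ("player2", 100.0), ("player3", 100.0)]

joined = sorted(join_pairs(links, ranks))
print(joined)
```

Keys that appear on only one side (here `player3`, which has a rank but no entry in `links`) are dropped, as expected for an inner join.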
```python
@@ -436,7 +439,6 @@ only showing top 20 rows
df.filter(df['species']=='setosa').show()

```

```
+------------+-----------+------------+-----------+-------+
|sepal_length|sepal_width|petal_length|petal_width|species|
@@ -463,6 +465,7 @@ df.filter(df['species']=='setosa').show()
| 5.1| 3.8| 1.5| 0.3| setosa|
+------------+-----------+------------+-----------+-------+
only showing top 20 rows
```

```py
@@ -471,44 +474,27 @@ df.groupBy('species').sum().show()
```

```
-+------------+-----------+------------+-----------+-------+
-|sepal_length|sepal_width|petal_length|petal_width|species|
-+------------+-----------+------------+-----------+-------+
-| 5.1| 3.5| 1.4| 0.2| setosa|
-| 4.9| 3.0| 1.4| 0.2| setosa|
-| 4.7| 3.2| 1.3| 0.2| setosa|
-| 4.6| 3.1| 1.5| 0.2| setosa|
-| 5.0| 3.6| 1.4| 0.2| setosa|
-| 5.4| 3.9| 1.7| 0.4| setosa|
-| 4.6| 3.4| 1.4| 0.3| setosa|
-| 5.0| 3.4| 1.5| 0.2| setosa|
-| 4.4| 2.9| 1.4| 0.2| setosa|
-| 4.9| 3.1| 1.5| 0.1| setosa|
-| 5.4| 3.7| 1.5| 0.2| setosa|
-| 4.8| 3.4| 1.6| 0.2| setosa|
-| 4.8| 3.0| 1.4| 0.1| setosa|
-| 4.3| 3.0| 1.1| 0.1| setosa|
-| 5.8| 4.0| 1.2| 0.2| setosa|
-| 5.7| 4.4| 1.5| 0.4| setosa|
-| 5.4| 3.9| 1.3| 0.4| setosa|
-| 5.1| 3.5| 1.4| 0.3| setosa|
-| 5.7| 3.8| 1.7| 0.3| setosa|
-| 5.1| 3.8| 1.5| 0.3| setosa|
-+------------+-----------+------------+-----------+-------+
-only showing top 20 rows
++----------+------------------+------------------+------------------+------------------+
+| species| sum(sepal_length)| sum(sepal_width)| sum(petal_length)| sum(petal_width)|
++----------+------------------+------------------+------------------+------------------+
+| virginica| 329.3999999999999| 148.7|277.59999999999997|101.29999999999998|
+|versicolor| 296.8|138.50000000000003|212.99999999999997| 66.3|
+| setosa|250.29999999999998|170.90000000000003| 73.2|12.199999999999996|
++----------+------------------+------------------+------------------+------------------+
```
```py
df.groupBy('species').max().show()
```

```
-+----------+-----------------+----------------+-----------------+----------------+
+++----------+-----------------+----------------+-----------------+----------------+
| species|max(sepal_length)|max(sepal_width)|max(petal_length)|max(petal_width)|
+----------+-----------------+----------------+-----------------+----------------+
| virginica| 7.9| 3.8| 6.9| 2.5|
|versicolor| 7.0| 3.4| 5.1| 1.8|
| setosa| 5.8| 4.4| 1.9| 0.6|
+----------+-----------------+----------------+-----------------+----------------+
```
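As a rough illustration of what the grouped `sum()` and `max()` calls compute, the sketch below does the same per-species, per-column aggregation in plain Python. It is not PySpark code and not part of the commit. The helper `group_aggregate` is hypothetical, the three setosa rows are copied from the filtered table shown earlier, and the two versicolor rows are illustrative values added only so that there is more than one group.

```python
from collections import defaultdict

# (sepal_length, sepal_width, petal_length, petal_width, species)
rows = [
    (5.1, 3.5, 1.4, 0.2, "setosa"),
    (4.9, 3.0, 1.4, 0.2, "setosa"),
    (4.7, 3.2, 1.3, 0.2, "setosa"),
    (7.0, 3.2, 4.7, 1.4, "versicolor"),  # illustrative row, not from the output above
    (6.4, 3.2, 4.5, 1.5, "versicolor"),  # illustrative row, not from the output above
]


def group_aggregate(rows, agg):
    """Group rows by the last field and apply `agg` to each numeric column.

    Hypothetical helper that mirrors the shape of a grouped aggregation:
    one result tuple per species, one entry per numeric column.
    """
    groups = defaultdict(list)
    for *values, species in rows:
        groups[species].append(values)
    return {
        species: tuple(agg(column) for column in zip(*measurements))
        for species, measurements in groups.items()
    }


maxima = group_aggregate(rows, max)
sums = group_aggregate(rows, sum)
print(maxima)
print(sums)
```

Long decimals such as `250.29999999999998` in the summed output are ordinary binary floating-point rounding, not a data error; the same effect can appear in the sums computed by this sketch.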
**Clustering**

2 changes: 1 addition & 1 deletion docs/tutorial.md
@@ -18,7 +18,7 @@ The dataset contains `.csv files` with WTA matches from 1968 until 2023.



-## Requirement
+## Requirements

- python3
- mrjob ```pip install mrjob```