All about DataScience, DataEngineering and ComputerScience
A batch pipeline processes a bounded amount of data and then exits.






When a transformation cannot be expressed in SQL, use Dataflow as the ETL tool and land the data in BigQuery.
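
As a minimal sketch of the kind of transformation that is awkward to express in SQL, the snippet below parses free-form log lines with custom Python logic and produces flat rows that could be loaded into a table. The log format, the field names and the helper names (`parse_line`, `run_etl`) are hypothetical and invented for this example. It uses only the standard library; in a real Dataflow job the per-line function would typically be applied with Apache Beam's `beam.Map` and the rows written with `beam.io.WriteToBigQuery`.

```python
import json
import re
from typing import Iterable, List, Optional

# Hypothetical log format (an assumption for this sketch):
#   <timestamp> <LEVEL> <json payload>
LINE_RE = re.compile(r"^(?P<ts>\S+) (?P<level>[A-Z]+) (?P<payload>\{.*\})$")


def parse_line(line: str) -> Optional[dict]:
    """Transform one raw line into a flat row; return None if it is malformed."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    try:
        payload = json.loads(match.group("payload"))
        # A missing amount is treated as 0 (an assumption of this sketch).
        amount = float(payload.get("amount", 0))
    except (ValueError, TypeError):
        # Covers invalid JSON and amounts that are not numeric.
        return None
    return {
        "timestamp": match.group("ts"),
        "level": match.group("level"),
        "user": payload.get("user"),
        "amount_cents": round(amount * 100),
    }


def run_etl(lines: Iterable[str]) -> List[dict]:
    """Extract, transform, and keep only the rows that parsed cleanly."""
    parsed = (parse_line(line) for line in lines)
    return [row for row in parsed if row is not None]


raw = [
    '2024-01-01T00:00:00Z INFO {"user": "a", "amount": "12.50"}',
    "not a log line",
    '2024-01-01T00:00:05Z WARN {"user": "b"}',
]
rows = run_etl(raw)
for row in rows:
    print(row)
```

Malformed lines are dropped here for brevity; a production pipeline would more likely route them to a separate output for inspection.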

Look Beyond Dataflow and BigQuery
| Issue | Solution |
|---|---|
| Low latency required | Dataflow to Bigtable |
| Existing Spark workloads | Dataproc |
| Visual pipeline building | Cloud Data Fusion |


Metadata as a service.


| Bounded Data (Batch) | Unbounded Data (Stream) |
|---|---|
| Finite data set | Infinite data set |
| Complete | Never complete |
| Time of element is disregarded | Time of element is significant |
| At rest | In motion |
| Durable storage | Temporary storage |

| Data integration (10 s - 10 min) | Data decisions (100 ms - 10 s) |
|---|---|
| Real-time data warehouse | Real-time recommendations |
| Fraud detection | |
| Gaming events | |
| Finance back office | |
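
The difference between bounded and unbounded data can be sketched in a few lines of plain Python. A bounded data set is complete, so a single final result exists. An unbounded stream never completes, so results are emitted per event-time window instead. The event layout `(event_time_seconds, value)` and the function names are assumptions made for this illustration. The sketch also assumes events arrive in event-time order, which real streaming systems do not guarantee (they handle late data with watermarks and triggers).

```python
import itertools
from typing import Iterable, Iterator, List, Optional, Tuple

Event = Tuple[float, int]  # (event time in seconds, value)


def batch_total(events: List[Event]) -> int:
    """Bounded input: the data set is complete, so one final answer exists."""
    return sum(value for _, value in events)


def windowed_totals(
    events: Iterable[Event], window_s: float
) -> Iterator[Tuple[float, int]]:
    """Unbounded input: emit (window_start, total) per fixed event-time window.

    Assumes events arrive in event-time order (no late data).
    """
    current_start: Optional[float] = None
    total = 0
    for ts, value in events:
        start = (ts // window_s) * window_s
        if current_start is not None and start != current_start:
            # A new window began, so the previous one is considered closed.
            yield current_start, total
            total = 0
        current_start = start
        total += value
    # Only reached when the input is finite: flush the last open window.
    if current_start is not None:
        yield current_start, total


# Bounded: a finite list gives one complete result.
events = [(0.5, 1), (3.0, 2), (10.2, 5), (14.9, 1), (21.0, 4)]
print(batch_total(events))

# The same events, processed window by window (10-second fixed windows).
print(list(windowed_totals(events, 10.0)))

# Unbounded: an infinite generator, one event per second. The stream never
# ends, so only the first two closed windows are taken.
stream = ((float(i), 1) for i in itertools.count())
first_two = list(itertools.islice(windowed_totals(stream, 10.0), 2))
print(first_two)
```

The batch function ignores the timestamps entirely, while the streaming function depends on them to decide where each element belongs, which mirrors the "time of element" row in the table above.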


