All about DataScience, DataEngineering and ComputerScience
View the Project on GitHub datainsightat/DataScience_Examples
Pipelines process a certain amount of data and then exit.






When a transformation cannot be expressed in SQL, use Dataflow as the ETL tool and land the data in BigQuery.
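
The extract, transform, load flow can be sketched in plain Python. This is a minimal illustration, not Dataflow code: the function names, the sample records, and the `warehouse_table` list (standing in for a BigQuery table) are all hypothetical. In an actual Dataflow job the same steps would typically be written as Apache Beam transforms, with the load step handled by a BigQuery sink.

```python
import json
from typing import Iterable, Iterator


def extract(lines: Iterable[str]) -> Iterator[dict]:
    """Extract: parse raw JSON lines and skip malformed ones."""
    for line in lines:
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            # Assumption: bad input is dropped; a real pipeline would
            # more likely route it to a dead-letter output.
            continue


def transform(record: dict) -> dict:
    """Transform: custom logic standing in for a step that is awkward in SQL."""
    return {
        "user": record["user"].strip().lower(),
        # Deduplicate and sort the tags, then flatten them into one string.
        "tags": ",".join(sorted(set(record.get("tags", [])))),
    }


def load(rows: Iterable[dict], table: list) -> None:
    """Load: append rows to an in-memory list standing in for BigQuery."""
    table.extend(rows)


# Hypothetical raw input, including one malformed line.
raw_lines = [
    '{"user": " Alice ", "tags": ["b", "a", "b"]}',
    "not json",
    '{"user": "BOB"}',
]

warehouse_table: list = []
load((transform(r) for r in extract(raw_lines)), warehouse_table)
print(warehouse_table)
```

The transformation runs before the data lands, which is what distinguishes this ETL pattern from loading raw data first and transforming it with SQL afterwards.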

Look Beyond Dataflow and BigQuery
| Issue | Solution | 
|---|---|
| Latency | Dataflow to Bigtable | 
| Spark | Dataproc | 
| Visual | Cloud Data Fusion | 


Metadata as a service.


| Bounded Data (Batch) | Unbounded Data (Stream) |
|---|---|
| Finite data set | Infinite data set |
| Complete | Never complete |
| Time of element is disregarded | Time of element is significant |
| At rest | In motion |
| Durable storage | Temporary storage |

| Data Integration (10 sec - 10 min) | Data Decisions (100 ms - 10 sec) |
|---|---|
| Real-time data warehouse | Real-time recommendations |
| Fraud detection | |
| Gaming events | |
| Finance back office | |
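
The difference between bounded and unbounded data can be illustrated with a small Python sketch. It is an assumption-laden toy rather than a streaming engine: `sensor_stream`, the `(event_time, value)` tuple layout, and the 10-second window size are hypothetical, and events are assumed to arrive in event-time order. A bounded data set is complete, so a single pass yields a final result. An unbounded stream never completes, so results are emitted per window of event time.

```python
import itertools
from typing import Iterable, Iterator, Tuple

Event = Tuple[int, int]  # (event_time_seconds, value)


def batch_total(events: Iterable[Event]) -> int:
    """Bounded: the data set is complete, so one pass gives the final answer."""
    return sum(value for _, value in events)


def sensor_stream() -> Iterator[Event]:
    """Unbounded: a hypothetical source that emits one event per second, forever."""
    for t in itertools.count():
        yield (t, 1)


def windowed_totals(
    events: Iterable[Event], window_size: int
) -> Iterator[Tuple[int, int]]:
    """Emit (window_start, total) whenever event time enters a new fixed window.

    Assumes events arrive in event-time order (no late data).
    """
    current_start = None
    total = 0
    for event_time, value in events:
        start = event_time - event_time % window_size
        if current_start is not None and start != current_start:
            yield (current_start, total)
            total = 0
        current_start = start
        total += value
    if current_start is not None:
        # Only reached when the input is bounded after all.
        yield (current_start, total)


# Bounded input: event time is irrelevant to the result.
bounded = [(0, 5), (3, 7), (12, 1)]
batch_result = batch_total(bounded)

# Unbounded input: take the first three windows, then stop reading.
first_windows = list(itertools.islice(windowed_totals(sensor_stream(), 10), 3))

print(batch_result)
print(first_windows)
```

The batch function ignores the timestamp entirely, while the streaming function cannot produce any output without it, which mirrors the "time of element" row in the table.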


