Erik on Software

Topic

data

A collection of 174 issues

Media Query Source: Part 48 - Techronicler (India digital magazine); Important factors to consider when making build-or-buy Gen AI decisions

* Techronicler (India digital magazine)
* Important factors to consider when making build-or-buy Gen AI decisions
* More than one build-or-buy viewpoint should be considered
* Data security and maintainability are top of mind for me

My responses ended up being included in an article at Techronicler (January 27, 2025). Extent of verbatim quote

Media Query Source: Part 47 - Reworked (US digital magazine); How Microsoft's Magentic-One benefits from being open source

* Reworked (US digital magazine)
* How Microsoft's Magentic-One benefits from being open source
* Its open source license is one of the most permissive
* Strong communities lead to improvement and adoption

The query responses I provided to Reworked on November 24, 2024:

Reworked: How does Magentic-One's open-source nature

Media Query Source: Part 46 - CIO (US digital magazine); 10 key roles for AI success (2024 update)

* CIO (US digital magazine)
* 10 key roles every AI team needs
* Increased criticality of data engineer role
* Data modeler role was missed in first piece

My responses ended up being included in an article at CIO (October 17, 2024). Note that CIO overwrote the original June 7, 2022 article with

Community Comment: Part 36 - Data modelers need to understand patterns

* Data modelers need to understand patterns
* Like programming, don't reinvent the wheel
* Universal patterns & metapatterns exist
* Remember "The Data Model Resource Book"

Community Comment: Part 34 - Popular data engines use Java / the JVM because it ruled the enterprise for many years

* Reasoning behind popular data engines & Java
* Java / the JVM long ruled the enterprise
* Spark, Flink, Presto & Trino all run on the JVM
* Remember: Presto & Trino share some history

Community Post: Part 5 - Databricks is the 2023 runner-up winner following acceptance of my proposal to include it in DB-Engines ranking

* DB-Engines ranking of DBMS products
* Databricks climbs after my proposal to include it
* Databricks is runner-up winner for 2023
* Earlier rankings left out Databricks

Community Comment: Part 33 - With respect to Databricks vs Snowflake, actual usage is more interesting than account creation

* Snowflake vs Databricks account penetration
* Account growth rates not divulged
* Data can be easily misinterpreted
* Databricks: AI advantage, closing in elsewhere

Community Comment: Part 32 - Citizen users of SQL need multiple skills

* Citizen users of SQL need multiple skills
* Understand which tables & views are needed
* Generate correct output
* Performantly generate correct output

Community Comment: Part 31 - Wholesale self-service analytics is a dumb goal

* Dumb goal: wholesale self-service analytics
* Don't attempt a shotgun approach
* Self-service users need to be defined
* Identify & cater to power users

Product Reviews: Part 12 - Update with a note about data quality (be careful when using floating data types!)

Almost exactly a year has passed since publishing my last post in this series, and a lot has changed. So many changes have taken place, in fact, that I've decided to significantly decrease my contributions as an Amazon product reviewer, beginning at the conclusion of my second "