Data not showing from datalake

Joseph Y. Lee 5 Reputation points
2025-06-20T15:34:36.49+00:00

I'm building a dataflow to process data in our data lake, which is populated via a Dataverse link from Dynamics 365 into our Azure environment. When I query the data directly in the data lake, I can see all records, including the most recent updates. However, when I create a dataflow and set the source type to "Workspace DB" pointing to the same Dataverse-linked data lake, it only shows outdated records—specifically data from 2018—and none of the recent updates.

Azure Data Lake Analytics

1 answer

  1. Chandra Boorla 15,455 Reputation points Microsoft External Staff Moderator
    2025-06-20T16:11:43.0066667+00:00

    @Joseph Y. Lee

    Thanks for reaching out and providing detailed context.

    From what you described, it sounds like your Dataflow is pointing to a "Workspace DB" source that may be referencing an outdated snapshot or cached version of your Dataverse-linked data. Since you're able to see the most recent records when querying the data lake directly, the issue likely lies in how the Dataflow source is configured or refreshed.

    Here are a few troubleshooting steps that might help you:

    Refresh Timing - The "Workspace DB" may not reflect real-time data and could rely on a scheduled refresh. Please verify when it was last updated.

    Source Path Configuration - Ensure the Dataflow is correctly targeting the current location of the Dataverse-exported data. Sometimes data gets partitioned or moved based on export rules, and the path used in "Workspace DB" might still be pointing to an older folder.

    Filters in Dataflow - Review the Dataflow for any filters or date-based constraints that might be unintentionally excluding newer records.

    Schema or Metadata Changes - If there were any recent schema changes in the Dataverse tables, it's possible that the Dataflow is referencing a structure that no longer aligns with the updated dataset.

    If possible, try configuring the Dataflow to read directly from the Parquet files in your Data Lake instead of using "Workspace DB" to confirm whether the issue is with the source layer.
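One way to make that comparison concrete is to pull a sample of rows from each source and compare how fresh they are. The following is a minimal sketch, not a definitive implementation: it assumes the rows have already been loaded into lists of dictionaries (how you load them from the Parquet files or from "Workspace DB" is up to you), and it assumes the table carries a `modifiedon` timestamp column in ISO 8601 format. The function names `freshness_report` and `compare_sources` are hypothetical helpers introduced only for illustration.

```python
from collections import Counter
from datetime import datetime


def parse_ts(value):
    """Parse an ISO 8601 timestamp such as '2025-06-19T08:30:00Z'."""
    # Older Python versions do not accept a trailing 'Z', so normalise it.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def freshness_report(rows, ts_field="modifiedon"):
    """Return the row count, newest timestamp, and per-year counts for one source."""
    # ts_field is an assumption; use whichever modified-date column your table has.
    stamps = [parse_ts(r[ts_field]) for r in rows if r.get(ts_field)]
    return {
        "rows": len(rows),
        "latest": max(stamps) if stamps else None,
        "by_year": dict(Counter(s.year for s in stamps)),
    }


def compare_sources(lake_rows, workspace_rows, ts_field="modifiedon"):
    """Compare the direct data lake read against the 'Workspace DB' read."""
    lake = freshness_report(lake_rows, ts_field)
    workspace = freshness_report(workspace_rows, ts_field)
    # Years present in the lake but absent from the workspace source.
    missing_years = sorted(set(lake["by_year"]) - set(workspace["by_year"]))
    return {"lake": lake, "workspace": workspace, "missing_years": missing_years}


# Made-up sample rows that mimic the symptom described in the question.
lake_rows = [
    {"modifiedon": "2018-03-01T00:00:00Z"},
    {"modifiedon": "2025-06-19T08:30:00Z"},
]
workspace_rows = [
    {"modifiedon": "2018-03-01T00:00:00Z"},
]

result = compare_sources(lake_rows, workspace_rows)
print(result["missing_years"])
print(result["lake"]["latest"], result["workspace"]["latest"])
```

If `missing_years` lists recent years and the workspace `latest` value stops in 2018, the Dataflow source is most likely reading a stale or older location, which points to the refresh timing or source path rather than to a filter inside the Dataflow.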

    I hope this information helps. Please do let us know if you have any further queries.

    Thank you.

