    Podcast: How to get value from unstructured data

    By Admin
    We talk to Nasuni founder and chief technology officer (CTO) Andres Rodriguez about the characteristics needed from storage to make optimal use of unstructured data in the enterprise, as well as the challenge of its scale.

    He says the cloud has changed everything, with the cloud model of working providing a blueprint for a single pool of storage accessible from anywhere.

    He also says enterprises need to classify, tag and curate data to build rich metadata that can boost corporate knowledge of and access to data, as well as to make it available to artificial intelligence (AI), such as via Model Context Protocol (MCP) connectors.

    What is the nature of the obstacles to optimal use of unstructured data in the enterprise?

    It really is all about scale. I mean, if you go back to what unstructured data is, it’s all of the files in the file servers, the NAS [network-attached storage], etc.

    It’s all of that work product. So, if you are an architecture firm, it’s design drawings. If you’re a manufacturing firm, it’s design drawings and simulations. All of that ends up in the files, in the file systems of the enterprise.



    And in every organisation, in addition to that, there’s the classic office documents – Excel and PowerPoints and Word documents and PDFs. Those are generic across all industries. And so, you end up with this sort of huge potential repository that could be mined to add value to the organisation.

    But the challenge is, how do you access it? How do you control access to it at the same time that you can access it? And then, how do you plug it into the tools that are going to give you insights into that data? And doing that at scale is a really formidable challenge.

    So, what do customers need from the way unstructured data is stored so that they can gain as much insight from it as possible?

    The first thing is there’s so much of it in organisations that what ends up happening with traditional approaches is you end up with lots of silos of data. You know, the data gets stored in devices, the devices are all over the place, etc.

    If it’s a large organisation, there could be different geographic locations where employees are located, and they need high-performance access to files in those locations. So you end up building silos for those.

    It could just be capacity. You run out of capacity in one file server, so you deploy another one and another one, and you end up with this incredible number of file servers. So, when you look to do things that are valuable with the data, you realise that it’s become impossible because the data is in so many different silos, and it’s hard to get to the silos and aggregate them in any sort of logical way.

    The cloud changed all that. Many organisations, especially large organisations that have consolidated their unstructured data, their file data, into the cloud, have realised this enormous gain, which is that the data is now consolidated in one logical space that is infinitely scalable, and it’s available at very high levels of performance from anywhere in the world.

    The cloud is infinite and the cloud is everywhere. And so, that is an incredible foundational piece for them to be able to tap into that data repository, that unstructured data repository, and gather insights from the data.

    What technologies underpin the optimal use of unstructured data for customers, especially in this era of AI?

    I think there are several pieces.

    At the foundational level, you want technology that allows for NAS consolidation. One of our specialties is to provide that sort of NAS, enabled with the cloud, that gives you scale and high performance anywhere you want it. That’s the first building block.

    Then, on top of that block, you need to have unstructured data management tools that allow you to take that enormous repository and manage it properly at scale.

    For everything I’m talking about, you’re fighting a scale headwind, so you need to have the technology that allows you to get to hundreds of millions or billions of files and petabytes of storage. Otherwise, you’re going to end up being crippled in your efforts by the sheer scale of the problem.

    So, in this next layer of unstructured data management, you want to have very scalable tools that allow you to classify data, tag data, set access controls at a global level for the data – in other words, curate the data.
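    As a rough illustration of what "classify, tag, set access controls" could look like in code, here is a minimal sketch. Everything in it is an assumption made for the example: the `FileRecord` type, the extension-to-class mapping and the tag-to-group policy are hypothetical and do not describe any particular product.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath

# Hypothetical mapping from file extension to a coarse content class.
EXTENSION_CLASSES = {
    ".dwg": "design-drawing",
    ".xlsx": "office-document",
    ".pptx": "office-document",
    ".docx": "office-document",
    ".pdf": "office-document",
}


@dataclass
class FileRecord:
    """Illustrative metadata record for one file in the repository."""
    path: str
    size_bytes: int
    tags: set = field(default_factory=set)
    allowed_groups: set = field(default_factory=set)


def classify(record: FileRecord) -> FileRecord:
    """Tag a record with a content class derived from its extension."""
    suffix = PurePosixPath(record.path).suffix.lower()
    record.tags.add(EXTENSION_CLASSES.get(suffix, "unclassified"))
    return record


def apply_access_policy(record: FileRecord, policy: dict) -> FileRecord:
    """Grant access to groups based on tags; policy maps tag -> set of groups."""
    for tag in record.tags:
        record.allowed_groups |= policy.get(tag, set())
    return record


def curate(records: list, policy: dict) -> list:
    """Classify every record, then apply one global access policy to all of them."""
    return [apply_access_policy(classify(r), policy) for r in records]
```

    The point of the sketch is that the access policy is defined once, against tags, rather than per file server. A real system would classify on far richer signals than the file extension.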

    I mean, if you look at what people are trying to do now with AI and gaining insights from AI, the failure of most of those projects can be attributed to a lack of sufficient quality data going into the LLMs [large language models]. In engineering school, they used to teach us, you put garbage into a model, you get garbage out of a model.

    The first priority is to clean up the data that’s going into your models. This means tools that allow you to do that at scale with the regular unstructured data that your organisation is producing, so that as the organisation continues to evolve, that dataset is updated automatically.

    Not because you’re doing some special kind of lift and effort, but because you’ve already set up the pipelines and all the systems are automatically cleaning up the data and making the data available to the machine learning models.

    That’s how you get a system that doesn’t just work once when you’re running the project, but adds insights to the organisation on an ongoing basis.
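    One way to picture a pipeline that keeps the dataset current without a special effort each time is an incremental refresh that only reprocesses files whose content has changed. The sketch below assumes an in-memory `dict` as the dataset and a caller-supplied `clean` function; both are hypothetical stand-ins for whatever storage and cleaning steps an organisation actually uses.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Fingerprint file content so unchanged files can be skipped."""
    return hashlib.sha256(data).hexdigest()


def basic_clean(data: bytes):
    """Toy cleaning step: decode, collapse whitespace, reject empty files."""
    text = " ".join(data.decode("utf-8", errors="ignore").split())
    return text or None


def refresh_dataset(dataset: dict, current_files: dict, clean) -> list:
    """Bring `dataset` in line with `current_files` (path -> bytes).

    Only new or changed files are passed to `clean`. Rejected files are
    remembered with text=None so they are not cleaned again until they change.
    Returns the paths that were (re)processed in this run.
    """
    # Drop entries for files that no longer exist.
    for path in set(dataset) - set(current_files):
        del dataset[path]

    processed = []
    for path, data in current_files.items():
        digest = content_hash(data)
        existing = dataset.get(path)
        if existing is not None and existing["hash"] == digest:
            continue  # unchanged since the last run
        dataset[path] = {"hash": digest, "text": clean(data)}
        processed.append(path)
    return processed


def model_ready(dataset: dict) -> dict:
    """Return only the entries that passed cleaning (path -> cleaned text)."""
    return {p: e["text"] for p, e in dataset.items() if e["text"] is not None}
```

    Run on a schedule or on file-change events, a loop like this keeps the model-facing dataset in step with the organisation's files while doing work only on what changed.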

    And so, the last layer is this sort of general-purpose plug-in into all of the available LLMs. There isn’t going to be a single one that’s going to meet all your needs.

    You need to have a sort of hub that allows you to connect. The term people are using now is MCP interfaces, which give you standard access to different models. That sort of standardisation at the level of the models is crucial because the dataset isn’t going to change.

    I mean, it’s going to change when workers change, but it isn’t going to change based on what model you’re using. You have to be able to plug in whatever model is best suited to the goal you’re trying to achieve.

    And if it doesn’t work, or if you want an upgrade, or if you want to switch vendors, you need to be able to change that. It’s what we call late binding: you need to be able to make that decision later in the project.

    And then, of course, you need to close the loop and see, through some sort of reporting interface – things like Tableau – the insights you’re getting from the data.

    What our clients typically want to do is look at project data and estimate: is this project going to be on time? Is it going to be on budget, based on signals coming from the unstructured data?
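    The late-binding idea can be sketched in plain Python as a common interface plus a registry, so the choice of model is a name resolved at run time rather than something baked into the pipeline. This is an analogy only and not an implementation of MCP; the `Model` protocol, `ModelHub` and the two stand-in backends are hypothetical names invented for the example.

```python
from typing import Callable, Dict, List, Protocol


class Model(Protocol):
    """The only thing the pipeline assumes about a model backend."""
    def complete(self, prompt: str) -> str: ...


class EchoModel:
    """Stand-in backend; a real one would call a vendor's service."""
    def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"


class UppercaseModel:
    """Second stand-in backend, to show that swapping is a one-word change."""
    def complete(self, prompt: str) -> str:
        return prompt.upper()


class ModelHub:
    """Registry that resolves a model by name at call time (late binding)."""

    def __init__(self) -> None:
        self._factories: Dict[str, Callable[[], Model]] = {}

    def register(self, name: str, factory: Callable[[], Model]) -> None:
        self._factories[name] = factory

    def resolve(self, name: str) -> Model:
        try:
            return self._factories[name]()
        except KeyError:
            raise ValueError(f"no model registered under {name!r}") from None


def summarise(documents: List[str], hub: ModelHub, model_name: str) -> List[str]:
    """Run the same curated documents through whichever model is named."""
    model = hub.resolve(model_name)
    return [model.complete(doc) for doc in documents]
```

    Because the documents and the pipeline are untouched when `model_name` changes, switching vendors or upgrading a model becomes a configuration decision that can be made late in the project.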

    Or you want to be able to do compliance at a higher level of knowledge. Perhaps you want to understand not just what’s in the files, but how end users interact with those files, how those files have changed over time. That can give you enormous insights into the behaviour of your unstructured data, and how your organisation is using or not using that data.

    So, it’s really about the integration of those three layers: the foundational NAS consolidation or unstructured data consolidation layer, which is all about storage and making sure the data is protected, making sure you have capacity and high performance. Then above that is an unstructured data management layer that allows you to curate the data and prepare it so that you make it available to the third layer, which is the interface to all the machine learning models.

    I guess the curation and classification layer part of things is all about the metadata. Would that be the case?

    That is correct.

    Sometimes you can harness the data to come up with metadata, but the rules are always based on metadata.

    So, the idea is you have to have a rich structure. This is why that first layer, the NAS consolidation, is so important.

    It’s because you need a rich structure in your file system that allows you to annotate your data with new metadata, so that rules can be set based on that metadata to control the curation and behaviour of the unstructured data.
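    A small sketch of rules driven purely by metadata may help make this concrete. The metadata keys (`days_since_access`, `tags`, `project`), the rule names and the action strings are all hypothetical, chosen only to illustrate the pattern.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Metadata = Dict[str, object]


@dataclass
class Rule:
    """A curation rule: a predicate over metadata and the action it triggers."""
    name: str
    matches: Callable[[Metadata], bool]
    action: str


def evaluate(rules: List[Rule], metadata: Metadata) -> List[str]:
    """Return the actions of every rule whose predicate matches, in rule order."""
    return [rule.action for rule in rules if rule.matches(metadata)]


# Hypothetical rule set; none of these keys or actions come from the interview.
RULES = [
    Rule("archive-stale",
         lambda m: m.get("days_since_access", 0) > 365,
         "archive"),
    Rule("exclude-confidential",
         lambda m: "confidential" in m.get("tags", ()),
         "exclude-from-ai"),
    Rule("index-project-docs",
         lambda m: m.get("project") is not None,
         "index"),
]
```

    For example, `evaluate(RULES, {"project": "tower", "tags": []})` yields only the `index` action. The rules never open the file itself: they read the annotations attached to it, which is why the richness of the metadata sets the limit on what curation can do.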
