Open Data and why it is important to you

March 31, 2014
Introduction

About 25 years ago, Tim Berners-Lee designed the basics of what we know today as the World Wide Web. The WWW is a system of hypertext documents, accessible over the Internet, that are connected to each other through hyperlinks. A user navigates from one document to other relevant ones by following those links.

If you are reading this, you are also a user of the World Wide Web. You probably use it mostly for entertainment, for keeping in touch with your friends and for gathering information. But that information has to be presented in a concise way for you to understand it, filter it and find other relevant information. That is where open data becomes important.

Accessing and reusing available information

Information on the web comes in many forms. Websites have different designs, structures and styles. Sometimes the information is poorly organized, or you would like to restructure it or sort it by certain characteristics. When you are reading a webpage, there is not much you can do about that, and things get worse when you are working with a PDF file or an office document. Ideally, all this information would be displayed in a structured manner in a custom application where you could manipulate it as you wish. But the information first has to be processed before that application can display it. There are several ways to do this:

Manual input of data. The least reliable way: it takes a lot of time and relies on human effort, which is prone to mistakes.
Web scraping. This means parsing a web page, extracting the required data and storing it in a suitable form for later organizing. While this is faster than manual input, a small change in the structure of the source webpage can break the parser and stop the data gathering.
Using an Application Programming Interface (API). This is the best way of collecting data, as it can be done programmatically, quickly and without worrying about webpage layout changes or copying mistakes.
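
To make the difference between the last two approaches more concrete, here is a minimal Python sketch that uses only the standard library. The HTML fragment, the JSON payload and the names `TableScraper`, `scrape` and `from_api` are invented for illustration; a real website or API will look different and would have to be downloaded first.

```python
import json
from html.parser import HTMLParser

# Hypothetical page fragment standing in for a downloaded webpage.
SAMPLE_HTML = """
<table>
  <tr><td class="name">alpha</td><td class="value">10</td></tr>
  <tr><td class="name">beta</td><td class="value">20</td></tr>
</table>
"""

# Hypothetical API response carrying the same data.
SAMPLE_JSON = '[{"name": "alpha", "value": 10}, {"name": "beta", "value": 20}]'


class TableScraper(HTMLParser):
    """Collects (name, value) pairs from <td> cells identified by their class."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._current_class = None
        self._row = {}

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._current_class = dict(attrs).get("class")

    def handle_data(self, data):
        if self._current_class in ("name", "value") and data.strip():
            self._row[self._current_class] = data.strip()

    def handle_endtag(self, tag):
        if tag == "td":
            self._current_class = None
        elif tag == "tr":
            # Keep the row only if both expected cells were found.
            if "name" in self._row and "value" in self._row:
                self.rows.append((self._row["name"], int(self._row["value"])))
            self._row = {}


def scrape(html):
    """Web scraping: depends on the exact markup of the page."""
    parser = TableScraper()
    parser.feed(html)
    return parser.rows


def from_api(payload):
    """API: depends only on the documented field names."""
    return [(item["name"], item["value"]) for item in json.loads(payload)]


if __name__ == "__main__":
    print(scrape(SAMPLE_HTML))
    print(from_api(SAMPLE_JSON))
    # A cosmetic change in the page (a renamed CSS class) silently breaks the scraper.
    print(scrape(SAMPLE_HTML.replace('class="value"', 'class="amount"')))
```

Both functions return the same pairs for the sample input, but the last call shows how a small change in the page markup leaves the scraper with nothing to return, while the JSON payload is unaffected by layout changes.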

Another important aspect of reusing information from the web is its licensing terms. There are many licenses out there, some more restrictive than others. Among the best-known open licenses are the Creative Commons licenses, which allow creators to indicate which rights they reserve and which rights they waive for the benefit of recipients or other creators. These licenses apply not only to data, but also to music, pictures and other artistic content.

Data availability

There are two primary ways in which information is stored on the WWW:

  • on the web: data is accessible through the World Wide Web, but in an “opaque” format, that is, without references to other related resources (e.g. PDF documents, DOC documents, ODT documents, etc.)
  • in the web: data is stored in an open, structured format, with links and references to other related data or resources, and can be parsed and processed in a platform-agnostic way.

Five Star Open Data

It's great when you can access information on the web without restrictions such as paid subscriptions. But sometimes this is not enough: you may also want to alter that information, organize it or find other related information. With that in mind, open data can be grouped into five categories, according to its format and how easily it can be manipulated:

One Star Data: open-licensed data available in a format that can be viewed, printed, stored or shared, but that cannot be easily processed because it is unstructured (e.g. raster images or scanned documents).
Two Star Data: data is available in a structured way, but in a format that requires proprietary software to be viewed, edited or parsed. Being structured, it can also be exported to another, similar format.
Three Star Data: much the same as Two Star Data, but it can be viewed, edited or parsed without proprietary software. There are still no hyperlinks to related data, so we can't reference or query it from elsewhere.
Four Star Data: now we're talking about “data in the web”. It may contain URIs (Uniform Resource Identifiers), which are references to other related resources and can be shared on the Web. Parts of the data can also be reused. The references are usually expressed in RDF (Resource Description Framework), a World Wide Web Consortium standard.
Five Star Data: the information is interconnected. You can discover more related data while you are processing the data. Both the consumer and the publisher benefit from the network effect.
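
The five levels build on each other, which can be sketched as a small Python function. This is only an illustration of the categories described above, not an official tool; the `Dataset` class and its field names are assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical description of a published dataset."""
    open_license: bool            # available under an open license
    structured: bool              # machine-readable structure, not a scan
    non_proprietary_format: bool  # readable without proprietary software
    uses_uris: bool               # things are identified by URIs
    links_to_other_data: bool     # links out to other datasets


def star_rating(ds: Dataset) -> int:
    """Return 0 to 5 stars; each level requires all the levels below it."""
    criteria = [
        ds.open_license,
        ds.structured,
        ds.non_proprietary_format,
        ds.uses_uris,
        ds.links_to_other_data,
    ]
    stars = 0
    for met in criteria:
        if not met:
            break
        stars += 1
    return stars


if __name__ == "__main__":
    scanned_report = Dataset(True, False, False, False, False)
    open_format_table = Dataset(True, True, True, False, False)
    linked_dataset = Dataset(True, True, True, True, True)
    for ds in (scanned_report, open_format_table, linked_dataset):
        print(star_rating(ds))
```

Stopping at the first unmet criterion reflects the cumulative nature of the scheme: a dataset that uses URIs but has no open license still gets no stars in this sketch.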

Sources of open data

Most open data comes from national governments, which publish information about institutions, land borders, public procurement, activity reports, etc. Opening their data to the public is important for governments because it increases their transparency and accountability. It also helps developers create applications that address public and private needs.

DBpedia is another source of open data; it extracts structured content from the information created as part of the Wikipedia project. It allows users to query relationships and properties using SPARQL, an SQL-like query language, something that is not easily possible by just scraping the HTML of Wikipedia's webpages.
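
As a rough idea of what such a query looks like, the Python sketch below prepares a SPARQL request asking DBpedia for the capital of Romania and shows how a result in the standard SPARQL JSON format could be read. The helper names `build_request` and `extract_values` are made up for this example, the request is built but not sent, and `SAMPLE_RESPONSE` is a hand-written sample in the shape of that format, not a captured response.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Public DBpedia SPARQL endpoint (assumed to be reachable when actually queried).
ENDPOINT = "https://dbpedia.org/sparql"

QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>

SELECT ?capital WHERE {
  dbr:Romania dbo:capital ?capital .
}
"""


def build_request(query: str) -> Request:
    """Build (but do not send) a SPARQL protocol GET request."""
    url = ENDPOINT + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def extract_values(response_text: str, variable: str) -> list:
    """Return the values bound to one variable in a SPARQL JSON result."""
    bindings = json.loads(response_text)["results"]["bindings"]
    return [row[variable]["value"] for row in bindings if variable in row]


# Hand-written sample shaped like a SPARQL JSON result, for illustration only.
SAMPLE_RESPONSE = """
{
  "head": {"vars": ["capital"]},
  "results": {"bindings": [
    {"capital": {"type": "uri", "value": "http://dbpedia.org/resource/Bucharest"}}
  ]}
}
"""

if __name__ == "__main__":
    request = build_request(QUERY)
    print(request.full_url)
    print(extract_values(SAMPLE_RESPONSE, "capital"))
```

To run the query for real, the prepared request could be passed to `urllib.request.urlopen` and the response body handed to `extract_values`. Note that the answer is itself a URI, which can be followed to discover more data about that resource.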

Conclusions

If you know the importance of open source software, it's easy to understand the importance of open data as well. For the average user, open data means easier ways to get information, to find related information and to replicate and improve that information. Keeping your own data in an open format also makes it easier to locate, process for later use and improve.
