Technology Listing (Week 1, Feb 2020)


1. DBpedia

DBpedia (from "DB" for "database") is a project aiming to extract structured content from the information created in the Wikipedia project. This structured information is made available on the World Wide Web. DBpedia allows users to semantically query relationships and properties of Wikipedia resources, including links to other related datasets. Tim Berners-Lee described DBpedia as one of the most famous parts of the decentralized Linked Data effort.

Background 
The project was started by people at the Free University of Berlin and Leipzig University, in collaboration with OpenLink Software, and the first publicly available dataset was published in 2007. It is made available under free licences (CC-BY-SA), allowing others to reuse the dataset; it does not, however, use an open data license to waive the sui generis database rights.

Wikipedia articles consist mostly of free text, but also include structured information embedded in the articles, such as "infobox" tables (the pull-out panels that appear in the top right of the default view of many Wikipedia articles, or at the start of the mobile versions), categorisation information, images, geo-coordinates and links to external Web pages. This structured information is extracted and put in a uniform dataset which can be queried.

Dataset 
The 2016-04 release of the DBpedia data set describes 6.0 million entities, out of which 5.2 million are classified in a consistent ontology, including 1.5M persons, 810k places, 135k music albums, 106k films, 20k video games, 275k organizations, 301k species and 5k diseases. DBpedia uses the Resource Description Framework (RDF) to represent extracted information and consists of 9.5 billion RDF triples, of which 1.3 billion were extracted from the English edition of Wikipedia and 5.0 billion from other language editions.

From this data set, information spread across multiple pages can be extracted. For example, book authorship can be put together from pages about the work, or the author.
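
To make the idea of combining facts from several pages concrete, here is a minimal sketch in plain Python. It assumes a toy in-memory list of (subject, predicate, object) triples; the identifiers and the helper names `match` and `works_by` are made up for illustration and are not actual DBpedia URIs or APIs.

```python
# Toy triple store: each fact is a (subject, predicate, object) tuple.
# The identifiers below are illustrative, not actual DBpedia URIs.
TRIPLES = [
    ("Moby-Dick", "author", "Herman_Melville"),
    ("Moby-Dick", "publicationYear", "1851"),
    ("Herman_Melville", "birthPlace", "New_York_City"),
    ("Herman_Melville", "notableWork", "Billy_Budd"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def works_by(triples, author):
    """Combine facts stated on work pages with facts stated on the author's page."""
    from_works = {s for s, _, _ in match(triples, p="author", o=author)}
    from_author = {o for _, _, o in match(triples, s=author, p="notableWork")}
    return sorted(from_works | from_author)


works = works_by(TRIPLES, "Herman_Melville")
print(works)
```

A real query against DBpedia would normally be written in SPARQL, but the pattern-matching idea is the same: fix some positions of the triple and leave the others open.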

One of the challenges in extracting information from Wikipedia is that the same concepts can be expressed using different parameters in infobox and other templates, such as |birthplace= and |placeofbirth=. Because of this, queries about where people were born would have to search for both of these properties in order to get more complete results. As a result, the DBpedia Mapping Language has been developed to help in mapping these properties to an ontology while reducing the number of synonyms. Due to the large diversity of infoboxes and properties in use on Wikipedia, the process of developing and improving these mappings has been opened to public contributions.
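
The synonym problem described above can be sketched in a few lines of Python. This is only an illustration of the idea: the table `PROPERTY_MAP` and the function `normalize` are hypothetical, and the real DBpedia mappings are written in the DBpedia Mapping Language rather than as a Python dictionary.

```python
# Hypothetical mapping table: raw infobox parameter -> canonical ontology property.
PROPERTY_MAP = {
    "birthplace": "birthPlace",
    "placeofbirth": "birthPlace",
    "birth_place": "birthPlace",
}


def normalize(infobox):
    """Rewrite raw infobox keys to canonical property names where a mapping exists."""
    result = {}
    for key, value in infobox.items():
        canonical = PROPERTY_MAP.get(key.strip().lower(), key)
        # Keep the first value seen if two synonyms appear in one infobox.
        result.setdefault(canonical, value)
    return result


a = normalize({"name": "Ada Lovelace", "birthplace": "London"})
b = normalize({"name": "Alan Turing", "placeofbirth": "London"})

# After normalization a single property is enough to query both records.
born_in_london = [p["name"] for p in (a, b) if p.get("birthPlace") == "London"]
print(born_in_london)
```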

Version 2014 was released in September 2014. A main change since previous versions was the way abstract texts were extracted. Specifically, running a local mirror of Wikipedia and retrieving rendered abstracts from it made extracted texts considerably cleaner. Also, a new data set extracted from Wikimedia Commons was introduced.

Nowadays, DBpedia is one of the biggest representatives of Linked Open Data (LOD).

2. Freebase

Freebase was a large collaborative knowledge base consisting of data composed mainly by its community members. It was an online collection of structured data harvested from many sources, including individual, user-submitted wiki contributions. Freebase aimed to create a global resource that allowed people (and machines) to access common information more effectively. It was developed by the American software company Metaweb and ran publicly beginning in March 2007. Metaweb was acquired by Google in a private sale announced 16 July 2010. Google's Knowledge Graph was powered in part by Freebase.

Freebase data was available for commercial and non-commercial use under a Creative Commons Attribution License, and an open API, an RDF endpoint, and a database dump were provided for programmers.

On 16 December 2014, Google announced that it would shut down Freebase over the succeeding six months and help with the move of the data from Freebase to Wikidata.

On 16 December 2015, Google officially announced the Knowledge Graph API, which is meant to be a replacement for the Freebase API. Freebase.com was officially shut down on 2 May 2016.

On 8 September 2018, Google published on GitHub the source code of the graphd server, which was the Freebase backend.

Type of site: Online database
Available in: English
Owner: Metaweb Technologies (Google)
Website: www.freebase.com
Commercial: No
Registration: Optional
Launched: 3 March 2007
Current status: Offline (since 2 May 2016), succeeded by Wikidata
Content license: Creative Commons Attribution License

3. YAGO (database)

YAGO (Yet Another Great Ontology) is an open source knowledge base developed at the Max Planck Institute for Computer Science in Saarbrücken. It is automatically extracted from Wikipedia and other sources.

As of 2019, YAGO3 has knowledge of more than 10 million entities and contains more than 120 million facts about these entities. The information in YAGO is extracted from Wikipedia (e.g., categories, redirects, infoboxes), WordNet (e.g., synsets, hyponymy), and GeoNames. The accuracy of YAGO was manually evaluated to be above 95% on a sample of facts. To integrate it into the linked data cloud, YAGO has been linked to the DBpedia ontology and to the SUMO ontology.

YAGO3 is provided in Turtle and tsv formats. Dumps of the whole database are available, as well as thematic and specialized dumps. It can also be queried through various online browsers and through a SPARQL endpoint hosted by OpenLink Software. The source code of YAGO3 is available on GitHub.

YAGO has been used in the Watson artificial intelligence system.

Developer(s): Max-Planck-Institute Saarbrücken
Initial release: 2008
Stable release: 3.1 / June 6, 2017
Repository: github.com/yago-naga/yago3
Type: Semantic Web, Linked Data
License: Creative Commons CC-BY 3.0
Website: www.mpi-inf.mpg.de/departments/databases-and-information-systems/research/yago-naga/yago/

Ref: https://en.wikipedia.org/wiki/YAGO_(database)

4. Neo4j

Neo4j is a graph database management system developed by Neo4j, Inc. Described by its developers as an ACID-compliant transactional database with native graph storage and processing, Neo4j is the most popular graph database according to DB-Engines ranking, and the 22nd most popular database overall.

Neo4j is available in a GPL3-licensed open-source "community edition". The online backup and high availability extensions are not part of that edition; Neo4j, Inc. licenses Neo4j with these extensions under closed-source commercial terms.

Neo4j is implemented in Java and accessible from software written in other languages using the Cypher Query Language through a transactional HTTP endpoint, or through the binary "bolt" protocol.

Data structure 
In Neo4j, everything is stored in the form of an edge, node, or attribute. Each node and edge can have any number of attributes. Both nodes and edges can be labelled. Labels can be used to narrow searches. As of version 2.0, indexing was added to Cypher with the introduction of schemas. Previously, indexes were supported separately from Cypher.
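
The node/edge/attribute model can be illustrated with a small sketch in plain Python. This is not how Neo4j stores data internally and it does not use the Neo4j driver; the classes `Node` and `Edge`, the helper `nodes_with_label`, and the sample data are assumptions made up for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: set                                  # e.g. {"Person"}
    props: dict = field(default_factory=dict)    # any number of attributes


@dataclass
class Edge:
    start: Node
    end: Node
    label: str                                   # e.g. "KNOWS"
    props: dict = field(default_factory=dict)


alice = Node({"Person"}, {"name": "Alice"})
bob = Node({"Person"}, {"name": "Bob"})
acme = Node({"Company"}, {"name": "Acme"})
nodes = [alice, bob, acme]
edges = [
    Edge(alice, bob, "KNOWS", {"since": 2015}),
    Edge(alice, acme, "WORKS_AT"),
]


def nodes_with_label(nodes, label):
    """Labels narrow a search to one kind of node."""
    return [n for n in nodes if label in n.labels]


people = [n.props["name"] for n in nodes_with_label(nodes, "Person")]
alice_knows = [
    e.end.props["name"]
    for e in edges
    if e.start is alice and e.label == "KNOWS"
]
print(people, alice_knows)
```

In Neo4j itself, the same lookups would be expressed as Cypher queries rather than list comprehensions.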

ACID
In computer science, ACID (atomicity, consistency, isolation, durability) is a set of properties of database transactions intended to guarantee validity even in the event of errors, power failures, etc. In the context of databases, a sequence of database operations that satisfies the ACID properties (and can therefore be perceived as a single logical operation on the data) is called a transaction. For example, a transfer of funds from one bank account to another, even involving multiple changes such as debiting one account and crediting another, is a single transaction.

Atomicity guarantees that each transaction is treated as a single "unit", which either succeeds completely, or fails completely.

Consistency ensures that a transaction can only bring the database from one valid state to another, maintaining database invariants: any data written to the database must be valid according to all defined rules, including constraints, cascades, triggers, and any combination thereof. This prevents database corruption by an illegal transaction, but does not guarantee that a transaction is correct. Referential integrity guarantees the primary key – foreign key relationship.

Isolation: Transactions are often executed concurrently (e.g., multiple transactions reading and writing to a table at the same time). Isolation ensures that concurrent execution of transactions leaves the database in the same state that would have been obtained if the transactions were executed sequentially. Isolation is the main goal of concurrency control; depending on the method used, the effects of an incomplete transaction might not even be visible to other transactions. 

Durability guarantees that once a transaction has been committed, it will remain committed even in the case of a system failure (e.g., power outage or crash). This usually means that completed transactions (or their effects) are recorded in non-volatile memory.
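
The bank transfer example can be tried out with Python's built-in sqlite3 module. This is a minimal sketch using SQLite rather than Neo4j; the table layout, the account names, and the function `transfer` are made up for illustration. The credit is deliberately applied before the debit so that a failed debit shows the earlier change being rolled back (atomicity), while the CHECK constraint stands in for a consistency rule.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # This debit violates the CHECK constraint if funds are insufficient.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # succeeds
failed = transfer(conn, "alice", "bob", 500)  # debit fails, credit is undone
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, failed, balances)
```

After the failed transfer, both balances are unchanged from their state after the first transfer: the credit to the second account never becomes visible on its own.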

Ref 1: https://en.wikipedia.org/wiki/Neo4j
Ref 2: https://en.wikipedia.org/wiki/ACID

5. NetworkX

NetworkX is a Python library for studying graphs and networks. NetworkX is free software released under the BSD-new license.

Features 
- Classes for graphs and digraphs.
- Conversion of graphs to and from several formats.
- Ability to construct random graphs or construct them incrementally.
- Ability to find subgraphs, cliques, k-cores.
- Explore adjacency, degree, diameter, radius, center, betweenness, etc.
- Draw networks in 2D and 3D.

Suitability 
NetworkX is suitable for operation on large real-world graphs: e.g., graphs in excess of 10 million nodes and 100 million edges. Due to its dependence on a pure-Python "dictionary of dictionary" data structure, NetworkX is a reasonably efficient, very scalable, highly portable framework for network and social network analysis.
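
The "dictionary of dictionary" idea can be sketched in plain Python without the library. This is an illustration of the data structure only, not NetworkX's actual code; the helper `add_edge` and the sample graph are assumptions.

```python
# graph[u][v] holds the attribute dict of the edge between u and v.
def add_edge(graph, u, v, **attrs):
    graph.setdefault(u, {})[v] = attrs
    graph.setdefault(v, {})[u] = attrs  # undirected: both directions share one dict


graph = {}
add_edge(graph, "a", "b", weight=3)
add_edge(graph, "b", "c", weight=1)

neighbors_of_b = sorted(graph["b"])                         # adjacency lookup
degree = {node: len(adj) for node, adj in graph.items()}    # degree of every node
weight_ab = graph["a"]["b"]["weight"]                       # edge attribute lookup
print(neighbors_of_b, degree, weight_ab)
```

Because every step is a dictionary lookup, checking whether an edge exists or reading its attributes takes roughly constant time on average, which is one reason this layout works well for sparse graphs.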

Integration 
NetworkX is integrated into SageMath.

Ref: https://en.wikipedia.org/wiki/NetworkX

6. Notepad++

Notepad++ is a text editor and source code editor for use with Microsoft Windows. It supports tabbed editing, which allows working with multiple open files in a single window. The project's name comes from the C increment operator.

Notepad++ is distributed as free software. At first the project was hosted on SourceForge.net, from where it has been downloaded over 28 million times, and twice won the SourceForge Community Choice Award for Best Developer Tool. The project was hosted on TuxFamily from 2010 to 2015; since 2015 Notepad++ has been hosted on GitHub. Notepad++ uses the Scintilla editor component.

Features 
Notepad++ is a source code editor. It features syntax highlighting, code folding and limited autocompletion for programming, scripting, and markup languages, but not intelligent code completion or syntax checking. As such it may properly highlight code written in a supported schema but whether the syntax is internally sound or compilable cannot be verified. 

Notepad++ has features for consuming and creating cross-platform plain text files. It recognizes three newline representations (CR, CR+LF and LF) and can convert between them on the fly. In addition, it supports reinterpreting plain text files in various character encodings and can convert them to ASCII, UTF-8 or UCS-2. As such, it can fix plain text files that seem to be gibberish only because their character encoding was not properly detected.
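
What newline conversion and encoding reinterpretation mean at the byte level can be shown with a short Python sketch. This is not Notepad++ code; the functions `normalize_newlines` and `reinterpret` and the sample strings are made up for illustration.

```python
def normalize_newlines(data: bytes, newline: bytes = b"\n") -> bytes:
    """Convert CR, CR+LF and LF line endings to a single chosen representation."""
    # Replace CR+LF first so it is not counted as two separate line breaks.
    unified = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return unified.replace(b"\n", newline)


def reinterpret(garbled: str, wrong: str, actual: str) -> str:
    """Undo a wrong decoding: recover the original bytes, then decode them properly."""
    return garbled.encode(wrong).decode(actual)


mixed = b"one\r\ntwo\rthree\n"
unix = normalize_newlines(mixed)              # LF everywhere
windows = normalize_newlines(mixed, b"\r\n")  # CR+LF everywhere

# UTF-8 bytes mistakenly read as Latin-1 look like gibberish until reinterpreted.
garbled = "caf\u00e9".encode("utf-8").decode("latin-1")
fixed = reinterpret(garbled, "latin-1", "utf-8")
print(unix, windows, garbled, fixed)
```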

Notepad++ also has features that improve plain text editing experience in general, such as:

- Autosave
- Finding and replacing strings of text with regular expressions
- Guided indentation
- Line bookmarking
- Macros
- Simultaneous editing
- Split screen editing and synchronized scrolling
- Line operations, including sorting, case conversion (Uppercase, lowercase, camel case, sentence case), and removal of redundant whitespace
- Tabbed document interface

Now, we have Notepad++ on Ubuntu as well! 
Install Notepad++ Using Ubuntu GUI:

1. Go to "Ubuntu Software" (the application whose icon is an orange handbag with the letter "A" on it).
2. When the Ubuntu Software application opens, click the search icon in the top right corner of its window.
3. A search bar will appear. Type notepad++ and, once you find the application, click on it.
4. Click Install to start the installation of the Notepad-plus-plus application.

Note: Notepad++ on Ubuntu has been made possible through Wine.

Ref 1: https://en.wikipedia.org/wiki/Notepad%2B%2B
Ref 2: https://vitux.com/how-to-install-notepad-on-ubuntu/

7. Wine (software)

Wine (recursive backronym for Wine Is Not an Emulator) is a free and open-source compatibility layer that aims to allow computer programs (application software and computer games) developed for Microsoft Windows to run on Unix-like operating systems. Wine also provides a software library, known as Winelib, against which developers can compile Windows applications to help port them to Unix-like systems.

Wine provides its own Windows runtime environment which translates Windows system calls into POSIX-compliant system calls, recreating the directory structure of Windows systems, and providing alternative implementations of Windows system libraries, system services through wineserver and various other components (such as Internet Explorer, the Windows Registry Editor, and msiexec). Wine is predominantly written using black-box testing reverse-engineering, to avoid copyright issues.

The selection of "Wine is Not an Emulator" as the name of the Wine Project was the result of a naming discussion in August 1993 and is credited to David Niemi. Some confusion has been caused by an early FAQ that used "Windows Emulator" and by other invalid sources that appeared after the Wine Project name had been set. No code emulation or virtualization occurs when running a Windows application under Wine. "Emulation" usually refers to the execution of compiled code intended for one processor (such as x86) by interpreting/recompiling software running on a different processor (such as PowerPC). While the name sometimes appears in the forms WINE and wine, the project developers have agreed to standardize on the form Wine.

Wine is primarily developed for Linux and macOS, and there are, as of November 2018, well-maintained packages available for both platforms.

In a 2007 survey by desktoplinux.com of 38,500 Linux desktop users, 31.5% of respondents reported using Wine to run Windows applications. This plurality was larger than all x86 virtualization programs combined, as well as larger than the 27.9% who reported not running Windows applications.

Design 
The goal of Wine is to implement, fully or partially, the Windows APIs required by the programs that Wine users wish to run on top of a Unix-like system.

# Basic architecture
The programming interface of Microsoft Windows consists largely of dynamic-link libraries (DLLs). These contain a huge number of wrapper sub-routines for the system calls of the kernel, the NTOS kernel-mode program (ntoskrnl.exe). A typical Windows program calls some Windows DLLs, which in turn call the user-mode gdi/user32 libraries, which in turn use kernel32.dll (win32 subsystem), responsible for dealing with the kernel through system calls. The system-call layer is considered private to Microsoft programmers as documentation is not publicly available, and published interfaces all rely on subsystems running on top of the kernel. Besides these, there are a number of programming interfaces implemented as services that run as separate processes. Applications communicate with user-mode services through RPCs.

Wine implements the Windows application binary interface (ABI) entirely in user space, rather than as a kernel module. Wine mostly mirrors the hierarchy, with services normally provided by the kernel in Windows instead provided by a daemon known as the wineserver, whose task is to implement basic Windows functionality, as well as integration with the X Window System, and translation of signals into native Windows exceptions. Although Wineserver implements some aspects of the Windows kernel, it is not possible to use native Windows drivers with it, due to Wine's underlying architecture. This prevents certain applications and games from working, for example those using StarForce copy-protection which requires virtual device drivers to be installed.

# Libraries and applications
Wine allows for loading both Windows DLLs and Unix shared objects for its Windows programs. Its built-in implementations of the most basic Windows DLLs, namely NTDLL, KERNEL32, GDI32 and USER32, use the shared object method because they must also use functions in the host operating system. Higher-level libraries, such as WineD3D, are free to use the DLL format. In many cases users can choose to load a DLL from Windows instead of the one implemented by Wine. Doing so can provide functionality not yet implemented by Wine, but may also cause malfunctions if the DLL relies on something else not present in Wine.

Wine tracks its state of implementation through automated unit testing done at every git commit.

# Graphics and gaming
While most office software does not make use of complex GPU-accelerated graphics APIs, computer games do. To run these games properly, Wine would have to forward the drawing instructions to the host OS, and even translate them to something the host can understand.

  # DirectX
  DirectX is a collection of Microsoft APIs for rendering, audio and input. As of 2019, Wine 4.0 contains a DirectX 12 implementation for Vulkan API, and DirectX 11.2 for OpenGL. Wine 4.0 also allows Wine to run Vulkan applications by handing draw commands to the host OS, or in the case of macOS, by translating them into the Metal API by MoltenVK.
  
  # XAudio
  As of February 2019, Wine 4.3 uses the FAudio library (and Wine 4.13 included a fix for it) to implement the XAudio2 audio API (and more).
  
  # XInput and Raw Input
  Wine, since 4.0 (2019), supports game controllers through its builtin implementations of these libraries. They are built as Unix shared objects as they need to access the controller interfaces of the underlying OS, specifically through SDL.
  
  # Direct2D
  Wine 4.0 supports Direct2D 1.2.
  
  # Direct3D
  Much of Wine's DirectX effort goes into building WineD3D, a translation layer from Direct3D and DirectDraw API calls into OpenGL. As of 2019, this component supports up to DirectX 11. As of December 12, 2016, Wine is good enough to run Overwatch with D3D11. Besides being used in Wine, WineD3D DLLs are also useful in the Windows operating system itself, allowing older graphics cards to run games using newer DirectX versions and old DDraw-based games to render correctly.

Some work is ongoing to move the Direct3D backend to Vulkan API. Direct3D 12 support in 4.0 is provided by a "vkd3d" subproject, and WineD3D has in 2019 been experimentally ported to use the Vulkan API.

Wine, when patched, can alternatively run Direct3D 9 without translation via a free and open-source Gallium3D State Tracker for DX9. The Gallium3D layer allows for direct pass-through of drawing commands.

Ref: https://en.wikipedia.org/wiki/Wine_(software)

8. Terraform

Terraform is an open-source infrastructure as code software tool created by HashiCorp. It enables users to define and provision datacenter infrastructure using a high-level configuration language known as HashiCorp Configuration Language (HCL), or optionally JSON. Terraform supports a number of cloud infrastructure providers such as Amazon Web Services, IBM Cloud (formerly Bluemix), Google Cloud Platform, DigitalOcean, Linode, Microsoft Azure, Oracle Cloud Infrastructure, OVH, or VMware vSphere as well as OpenNebula and OpenStack.
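
Since Terraform also accepts JSON, a configuration can be generated from a script. The sketch below builds a small configuration in Python; it assumes the `local_file` resource from the hashicorp/local provider, and the resource name `example` and the file name `hello.txt` are made up for illustration. It only renders the JSON and does not run Terraform.

```python
import json

# One resource block: type "local_file", name "example" (both chosen for illustration).
config = {
    "resource": {
        "local_file": {
            "example": {
                "filename": "hello.txt",
                "content": "Hello from Terraform\n",
            }
        }
    }
}

rendered = json.dumps(config, indent=2)
print(rendered)

# To try it out, the text could be saved as main.tf.json in an empty
# directory, followed by `terraform init` and `terraform apply`.
```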

HashiCorp also supports a Terraform Module Registry, launched in 2017 at the HashiConf 2017 conference.

Initial release: July 28, 2014
Stable release: 0.12.19 / January 8, 2020
Repository: github.com/hashicorp/terraform
Written in: Go
Operating system: Linux, FreeBSD, macOS, OpenBSD, Solaris, and Microsoft Windows
Available in: English
Type: Infrastructure as Code
License: Mozilla Public License v2.0
Website: www.terraform.io

Ref: https://en.wikipedia.org/wiki/Terraform_(software)

9. Microsoft System Center Configuration Manager

Microsoft System Center Configuration Manager (SCCM, also known as ConfigMgr),[1] formerly Systems Management Server (SMS)[2] is a systems management software product developed by Microsoft for managing large groups of computers running Windows NT, Windows Embedded, macOS (OS X), Linux or UNIX, as well as Windows Phone, Symbian, iOS and Android mobile operating systems.[3] Configuration Manager provides remote control, patch management, software distribution, operating system deployment, network access protection and hardware and software inventory.

Components 
- Policy Infrastructure
- Service Window Manager
- State System
- Center Configuration Manager Scheduler (CCM Scheduler)
- Center Configuration Manager Configuration Item Software Developers Kit (CCM CI SDK)
- Desired Configuration Management Agent (DCM Agent)
- Desired Configuration Management Reporting (DCM Reporting)
- MTC
- CI Agent
- CI Store
- CI Downloader
- CI Task Manager
- CI State Store
- Content Infrastructure
- Software Distribution
- Reporting
- Software Updates
- Operating System Deployment

Ref: https://en.wikipedia.org/wiki/Microsoft_System_Center_Configuration_Manager

10. xlwings

xlwings is a Python module that allows Excel to be automated with Python instead of VBA.

With it, you can replace your VBA code with Python, a powerful yet easy-to-use programming language that is highly suited for numerical analysis. It supports Windows and macOS.

Ref: https://stackoverflow.com/tags/xlwings/info
