Friday, 1 December 2023

Basic Networking: Hub, Bridge, Switch and Router

(Thanks to Practical Networking for their informative YouTube channel, available here).
 
This follows my previous post.
 
In a network, the signal carrying data along a wire between two hosts weakens as it travels. If the distance is significant, the signal degrades to the point where the data can no longer be shared reliably.

A repeater is a device which regenerates a signal: when inserted between two hosts, the signal entering one end of the repeater is regenerated on the other side. This allows data to travel greater distances.

Definition: A repeater in networking is a device that amplifies and re-transmits signals, extending the reach of a network by regenerating and boosting the strength of the transmitted data, allowing it to travel over longer distances.

A more complex network, with more than two hosts, needs a device called a hub.

The hub connects multiple hosts in a network, allowing them to communicate by forwarding data to all connected devices. Hubs do not make intelligent decisions about where to send data; instead, they broadcast incoming data to all connected devices, regardless of the intended recipient. This can lead to network congestion and inefficiency. Hubs are largely obsolete in modern networks, with switches being more commonly used for better performance and efficiency*.

*(I will write more about this further down).

Hub’s purposes:
  • Connecting hosts directly to each other doesn’t scale; therefore, a central device to handle funnelling communications is needed.
  • We want an easy setup if we connect another host to the existing network: thanks to the hub, this new host will have connectivity to the other hosts in the network.
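
A hub is a multi-port repeater: it takes the data sent by a single host and repeats it to every other host connected to it.

The drawback is that every host receives every transmission, whether or not it is the intended recipient. To limit this, the hosts can be split across two hubs (each one connected to its own closed network of hosts), which are then linked together through a bridge.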
 
Bridges have only two ports, one for each hub (each hub represents a contained network). A bridge learns which hosts are on each side, so it can keep communications inside their own network without sending unwanted traffic from one hub to the other. When required, it can still allow communication between the two networks.

This solution lets the two networks communicate. However, whatever crosses the bridge is still pushed to all the hosts in the network on the other side, which is not ideal.
 

If you combine hubs and bridges, you have a switch, which is a central device that facilitates communication within a network without pushing unwanted packets to all the hosts in the network (a switch learns which hosts are on each port). 
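
To make the difference concrete, here is a minimal sketch of a hub and a learning switch. The class names, port numbers and host identifiers ("host-a", "host-b") are made up for illustration, and a real switch does considerably more than this.

```python
class Hub:
    """Repeats whatever arrives out of every port except the one it came in on."""

    def __init__(self, port_count):
        self.ports = list(range(port_count))

    def forward(self, in_port, src, dst):
        return [p for p in self.ports if p != in_port]


class Switch(Hub):
    """Learns which host sits behind which port and forwards selectively."""

    def __init__(self, port_count):
        super().__init__(port_count)
        self.table = {}  # host identifier -> port number

    def forward(self, in_port, src, dst):
        self.table[src] = in_port            # learn where the sender is
        if dst in self.table:
            return [self.table[dst]]         # known destination: one port only
        return super().forward(in_port, src, dst)  # unknown: flood like a hub


switch = Switch(port_count=4)
first = switch.forward(in_port=0, src="host-a", dst="host-b")   # host-b not learned yet
reply = switch.forward(in_port=1, src="host-b", dst="host-a")   # host-a already learned
second = switch.forward(in_port=0, src="host-a", dst="host-b")  # host-b now learned
print(first, reply, second)  # [1, 2, 3] [0] [1]
```

The first message is flooded because the switch has not seen host-b yet; once both hosts have sent something, each message goes out of a single port.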
 
Hosts in the same network share the same IP address space. So, if 192.168.1.X is the hypothetical IP space of a network made of various hosts, each host of that network will be identified by a specific number replacing X (e.g., the IP can be 192.168.1.1 for one host, 192.168.1.2 for another host of the same network etc.). Imagine this to be the home Wi-Fi where your computer, phone, printer etc. are connected (after all, these are all hosts identified by an IP).
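
As a small illustration, Python's standard ipaddress module can check whether an address belongs to that space. This sketch assumes that "192.168.1.X" corresponds to the network 192.168.1.0/24; the third address is a made-up outsider.

```python
import ipaddress

# Assumption: "192.168.1.X" is read as the /24 network 192.168.1.0/24.
home = ipaddress.ip_network("192.168.1.0/24")

laptop = ipaddress.ip_address("192.168.1.1")
phone = ipaddress.ip_address("192.168.1.2")
outsider = ipaddress.ip_address("192.168.2.7")  # hypothetical host elsewhere

print(laptop in home, phone in home, outsider in home)  # True True False
print(home.num_addresses)  # 256, i.e. X can range from 0 to 255
```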

A switch can only facilitate communications within the same network. So, if there are two separate networks inside the same organisation and each one has its own switch, hosts in one network cannot communicate with hosts in the other. To solve this, we need a router.

A router “sits” between networks and facilitates communications between them. A router provides a traffic control point (a logical location) to implement security measures, redirect and filter traffic. It can also connect networks with the internet. Switches cannot provide such filtering (actually, modern switches can but since they sit inside the network and not on its “edge” like a router, filtering is typically not required).

A router learns which networks it is attached to. Each piece of this knowledge is known as a “route”, and routes are stored in a routing table.

A router has an IP for each network it is attached to. So, each of its IPs (can be multiple) is part of the IP spaces of the networks attached to it.

If a host in network A wants to communicate with a host in network B through a router, the router’s IP address that belongs to the IP space of network A serves as the Gateway for the hosts in network A. In other words, the gateway is a host’s way out of its local network.
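
The decision a host makes can be sketched as follows. The addresses are hypothetical: network A is assumed to be 192.168.1.0/24, with the router's interface on that network at 192.168.1.254.

```python
import ipaddress

# Hypothetical addressing, chosen only for this example.
NETWORK_A = ipaddress.ip_network("192.168.1.0/24")
GATEWAY_A = ipaddress.ip_address("192.168.1.254")


def next_hop(destination):
    """Return the address a host in network A should hand its data to."""
    dest = ipaddress.ip_address(destination)
    if dest in NETWORK_A:
        return dest       # same network: deliver directly (switching)
    return GATEWAY_A      # another network: hand over to the router (routing)


print(next_hop("192.168.1.20"))  # 192.168.1.20
print(next_hop("10.20.55.127"))  # 192.168.1.254
```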

In an organisation with different departments and locations, it is likely that every department and location has a router and that these routers are connected to the internet (which represents a bunch of different routers itself!). 
 
Routing is the process of moving data between networks; switching instead happens within networks. There are many different devices working with networks (e.g., load balancer), but each one of them performs routing and/or switching.
 

Wednesday, 22 November 2023

Basic Networking: Hosts, IPs and Networks

This is the first of a series of posts investigating the complex world of Networking.

(Thanks to Practical Networking for their informative YouTube channel, available here). 

----------------------------------------------------------------------------------------------------------------------

In the context of Networking, if a device sends and receives traffic*, it is a host.

*(Amount of data moving across a computer network at any given time).

Therefore, computers, laptops, phones, printers, servers, routers and other networked devices are hosts.

Cloud servers or Internet of Things (IoT) devices are hosts too: TVs, Smart Watches, Speakers, Refrigerators… These can all be hosts.

Hosts fall into two main categories: Clients and Servers.

Clients initiate requests, Servers respond.

However, these terms are relative to a specific communication. In another communication, a Server can become the Client and vice versa.

In simpler terms, a Server is a computer with software installed which responds to requests.

----------------------------------------------------------------------------------------------------------------------

An IP (Internet Protocol) address represents the identity of each host. Each host must have at least one IP address to communicate over a network. Think of it as a phone number or a postal address.

This address is stamped on all communications sent by hosts.

Each message between hosts carries a Source address (which identifies the host that sent the communication, in other words the Client) and a Destination address (which identifies the host that should receive it, in other words the Server). If the Server responds, these addresses are swapped.
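
A tiny model of this swap, with a made-up Message class and example addresses (real packets carry many more fields):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    source: str        # address of the host that sent the message
    destination: str   # address of the host that should receive it
    payload: str

    def reply(self, payload):
        """Build the response: source and destination trade places."""
        return Message(source=self.destination,
                       destination=self.source,
                       payload=payload)


request = Message("192.168.0.10", "192.168.0.1", "request")
response = request.reply("response")
print(response.source, "->", response.destination)  # 192.168.0.1 -> 192.168.0.10
```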


There are two versions of IP addresses: IPv4 and IPv6.

  • IPv4 addresses are 32-bit numerical labels written in the form of four sets of decimal numbers separated by periods (dots). Each decimal number is called an octet, and it represents 8 bits.
    • Example IPv4 address: 192.168.0.1
    • In binary, an IPv4 address is represented as 32 bits: a combination of thirty-two ones and zeros divided into four octets (e.g., the octet 11000000 is 192 in decimal). Every octet represents a decimal number (min 0, max 255).
  • IPv6 addresses are 128 bits in length and are written as a series of hexadecimal numbers separated by colons.
    • Example IPv6 address: 2001:0db8:85a3:0000:0000:8a2e:0370:7334
    • In binary, an IPv6 address is represented as 128 bits. 
    • IPv6 was introduced to address the limitations of IPv4, primarily the exhaustion of available IPv4 addresses due to the growth of the internet.
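
The two example addresses above can be inspected with a short sketch. The helper name ipv4_to_binary is made up; ipaddress is part of Python's standard library.

```python
import ipaddress


def ipv4_to_binary(address):
    """Render each decimal octet of an IPv4 address as 8 binary digits."""
    return ".".join(f"{int(octet):08b}" for octet in address.split("."))


print(ipv4_to_binary("192.168.0.1"))
# 11000000.10101000.00000000.00000001

v6 = ipaddress.ip_address("2001:0db8:85a3:0000:0000:8a2e:0370:7334")
print(v6.max_prefixlen)  # 128 (bits in an IPv6 address)
print(v6.compressed)     # 2001:db8:85a3::8a2e:370:7334
```

The last line shows the shorter way the same IPv6 address is usually written: leading zeros are dropped and one run of all-zero groups is replaced by "::".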

When assigned, IP addresses follow a hierarchy. A large set of hosts that includes smaller subsets could use 10.X.X.X. A specific subset of that set might use 10.20.X.X, and a subset within that subset might use 10.20.55.X. Each level fixes one more part of the address.

Why is this important? Because by following the hierarchy, it is easier to pinpoint where a particular host exists. Think of a host in a multinational enterprise with the IP 10.20.55.127: the address might identify the enterprise (10), a specific branch office of that enterprise, such as London (20), a specific team in that office, such as sales (55), and finally the individual host (127).
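
The nesting can be checked with the ipaddress module. The prefix lengths (/8, /16, /24) are an assumption: they treat each fixed number in the address as 8 fixed bits.

```python
import ipaddress

# Assumed prefix lengths: each fixed octet adds 8 bits to the prefix.
enterprise = ipaddress.ip_network("10.0.0.0/8")   # 10.X.X.X
london = ipaddress.ip_network("10.20.0.0/16")     # 10.20.X.X
sales = ipaddress.ip_network("10.20.55.0/24")     # 10.20.55.X
host = ipaddress.ip_address("10.20.55.127")

print(sales.subnet_of(london), london.subnet_of(enterprise))    # True True
print(all(host in net for net in (enterprise, london, sales)))  # True
```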

----------------------------------------------------------------------------------------------------------------------

Hosts exist in a network. A network is a series of connections between hosts for the purpose of sharing data and resources. Without a network, one would have to transfer data manually through disks, drives etc. In simple words, networking automates this transfer.

A network is a logical grouping of hosts that have similar connectivity. For example, your home Wi-Fi is a network: all the devices in your house connected to it have similar connectivity profiles and are grouped under one network.

A network can contain other networks, called subnetworks or subnets (e.g., a school having a network with a subnetwork for each class).

Networks connect to the Internet (a name derived from “interconnected networks”) in order to reach other networks.

Monday, 2 October 2023

Branches in Git

In the previous post on Git, changes have been made directly on the main branch*.

*(“master” used to be the default branch name for any Git repository. Hosting services such as GitHub now default to “main” to promote more inclusive language, while Git itself may still create “master” unless it is configured otherwise).

Let’s see how to set up new branches.

(Thanks to Kevin Stratvert for the tutorial 'Git and GitHub for Beginners Tutorial', available here).

Before we do that, what is a new branch?

It is a copy of the main branch in which we want to modify single aspects of our code without affecting the main branch. Once all the necessary changes have been made, we can merge it back into the main branch.

In other words, you can create a new branch for a new feature or for a bug fix etc. without affecting "main". 


To create a new branch, type this on your terminal:

git branch NameOfTheChange

"NameOfTheChange" was made up. It's just the name of the new branch. 

PS: make sure to be in the right directory where the repository is! E.g.: cd C:\Users\etc.

git branch -> to list the existing branches. The * sign tells us in which branch we are currently operating.

For the command above, I used the PowerShell terminal in Visual Studio Code. I discovered that my "main" branch is called “master” by my terminal, despite having specifically named it "main" when I previously set it up (a likely cause is that the init.defaultBranch setting was not in effect when git init ran). If this is the case and you are in the "master" branch, use git branch -m main (-m stands for "move/rename").

git switch NameOfTheChange -> to switch branch.

The command git status displays information about the current state of my working directory and the Git repository. As expected, it also informs us about which branch we are currently operating in.

Let’s see the branch in action. I am expecting to make a change to my branch without affecting any other branch.

I have a text file (.txt) in my repository, so I modify and commit it. I proceed in this order:

1) I modify the file manually while being in "NameOfTheChange". Save it.

2) git status -> to verify the modification.

3) git commit -a -m "description of what I am doing" -> to commit the change.

Remember that -a automatically stages the changes to files Git already tracks before committing them (new, Untracked files are not included). -m is used to add a message to the commit.

If I now switch the branch, from “NameOfTheChange” to “main”, I expect the text file in the repository to not show my last change. The repository should change according to which branch I am currently working with. I type the switch command:

git switch main

Note: Watch out if your “main” is called “master” like in my case!
 
I open the repository and the text file for "main". It should not display my last change that I made in “NameOfTheChange”. However, both branches display the same text file despite having specifically changed it for one of my branches only.
 
I need to investigate what the issue is.

I created a new file in my repository called text.txt with random words in it. I am in the “NameOfTheChange” branch. I commit it in that branch but, again, both branches display this file. I was expecting only “NameOfTheChange” to display it.

I tried the same process, but I accidentally pressed a wrong key in my terminal during the commit. Now any further commit produces the following error:

'Another git process seems to be running in this repository, e.g.
an editor opened by 'git commit'. Please make sure all processes
are terminated then try again. If it still fails, a git process
may have crashed in this repository earlier: remove the file manually to continue.' 

This problem occurs when two git commands are executed simultaneously (e.g., one from the IDE and one from the command prompt, or when a second command is given while the first one is still executing). To solve this, I proceeded in this way:

1) I accessed the hidden folder of Git (.git) in the repository. 

2) I deleted the "index.lock" file*. 

3) I checked my git status again.

4) I properly committed the previous changes (adding a comment too!).

*(‘When you perform a Git command that edits the index, Git creates a new index.lock file, writes the changes, and then renames the file. The index.lock file indicates to other Git processes that the repository is locked for editing.’ Source).
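
The lock-then-rename idea described in the quote can be sketched in a few lines. This is only an illustration of the pattern, not Git's actual implementation: the function name write_with_lock and the temporary directory are made up.

```python
import os
import tempfile


def write_with_lock(path, data):
    """Write data to path via a '<path>.lock' file, in the spirit of index.lock."""
    lock_path = path + ".lock"
    # O_EXCL makes creation fail if the lock file already exists,
    # i.e. if another process is (or appears to be) in the middle of an update.
    fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    with os.fdopen(fd, "w") as lock_file:
        lock_file.write(data)
    os.replace(lock_path, path)  # the rename publishes the content and removes the lock


workdir = tempfile.mkdtemp()
index = os.path.join(workdir, "index")
write_with_lock(index, "first version")

# Simulate a crashed process that left its lock file behind.
open(index + ".lock", "w").close()
try:
    write_with_lock(index, "second version")
    blocked = False
except FileExistsError:
    blocked = True

os.remove(index + ".lock")  # the manual fix: delete the stale lock file
write_with_lock(index, "second version")

with open(index) as handle:
    content = handle.read()
print(blocked, content)  # True second version
```

A leftover lock file blocks every later write until it is removed, which matches the behaviour I ran into.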

I noticed that Git identifies changes to .htm files in my repository, but not changes to .txt files (in my case, I simply opened a .htm file with Notepad and modified it). I am still looking for an explanation for this behaviour.

In conclusion, I tested the branches by modifying a .htm file in one branch and committing the change in that branch only. When I switched to the other branch I noticed that, as expected, the same .htm file had no changes there. In this way, I demonstrated that branches are independent: code can be modified in one branch without affecting the same code in other branches.
 
SOLUTION: In my previous post, I made a .gitignore file which instructed Git to ignore all .txt extensions. These files are then not reported by git status.
 
---------------------------------------------------------------------------------------------------------------------- 

Once we’re happy about the changes, we can merge the new branch with our “main”.

git merge -m "Type a comment here" NameOfTheChange

“NameOfTheChange” is the branch to merge with “main” (you must be in “main” to do it, use git switch).

I can now delete the “NameOfTheChange” branch:

git branch -d NameOfTheChange

-d stands for "delete".
 
Use git branch to confirm that the “NameOfTheChange” branch is no more.

git switch -c NameOfABranch -> create a new branch and automatically switch to it (-c stands for “create”).

When you merge a branch with “main” and both branches have changed the same part of the same file in different ways, Git will report a CONFLICT, the type of conflict (e.g., content), and the file concerned. We must resolve the conflict (e.g., by editing the file manually) before the merge can be completed.
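
The rule behind a content conflict can be sketched as a three-way comparison between the common starting point ("base") and the two branches. This is a deliberate simplification: it treats a whole file as one unit, whereas Git compares individual regions of a file. The function name merge_file is made up.

```python
def merge_file(base, ours, theirs):
    """Three-way merge of one file's content, treated as a single unit."""
    if ours == theirs:
        return ours      # both sides agree (or neither changed anything)
    if ours == base:
        return theirs    # only the other branch changed the file
    if theirs == base:
        return ours      # only our branch changed the file
    raise ValueError("CONFLICT (content): both branches changed the file differently")


print(merge_file("hello", "hello", "hello world"))  # hello world

try:
    merge_file("hello", "hello main", "hello feature")
except ValueError as error:
    print(error)
```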

Wednesday, 23 August 2023

An Introduction to Git

My first post on this blog is about Git, one of the most popular version control systems which is used in many software development projects.

Git is a critical tool for managing and collaborating on software projects, because it allows multiple developers to work on the same codebase* simultaneously while keeping track of changes, versions, and history.

*(In other words, the collection of components such as source code, scripts etc. which make up a software application).

While studying Git, I came across some concepts that are unfamiliar to me such as Version Control Systems, Distributed Version Control Systems and Source Code Management.

Let’s delve into them. 


Version Control Systems (VCS): software tools that help track changes to files and code over time. They also cover large binary files and digital assets.

Distributed Version Control Systems (DVCS): A type of VCS that allows every contributor to have all the history of the code repository. Apparently, it is a more decentralised system: for example, contributors can work offline and have access to the full history without relying on a central server.

Source Code Management (SCM): like VCS but only used to manage the source code.

Git is both a DVCS and an SCM!

----------------------------------------------------------------------------------------------------------------------

(Thanks to Kevin Stratvert for the tutorial 'Git and GitHub for Beginners Tutorial', available here).

I proceed to install Git from here and notice that it comes with a terminal called Git Bash. I assume that I can use another terminal that I have on my computer such as Command Prompt rather than Git Bash. Indeed, if I type the command

git config -h

on Command Prompt, it displays the help documentation for Git. Another useful command is

git help config

As suggested by tutorials, before I start using Git, I should specify my name and email address to associate an identity to my commits*.

*(The saving of a set of changes one has made to the project's source code or files; I think of it as a sort of snapshot!).

These are the configurations that I am going to insert in Command Prompt

git config --global user.name "Gian"

git config --global user.email "my.email@example.com"


Then I set the name of the default branch. A branch is a separate "copy" of the project that allows me to work on new features, bug fixes, or experiments without affecting the main codebase. I decided to call my default branch “main”.

git config --global init.defaultBranch main

To create a repository (“repo”), I identify a path on my PC where I keep the files that can make up a website. That will be the location for my repository. I first go to the chosen directory

cd C:\Users\etc.

Then

git init

In this way, I created a location where all the source code, project files, and historical changes related to a software project are stored and tracked. I notice that a hidden folder .git is created in my chosen directory; this probably contains necessary components to make Git work (like the heart of the repository).

I believe that if I were collaborating with other developers, this location should have been on a hosting service such as GitHub but I am unsure.

The command git status displays information about the current state of my working directory and the Git repository. In my case, for “C:\Users\etc.”, it displays the name of my branch (“main”) and specifies that there are no commits yet. It also lists the files I put in “C:\Users\etc.” and calls them "Untracked" (in a red font). These are actually not monitored by Git: 

  • Git does not track changes or history for Untracked files.
  • These files do not appear in the commit history, and Git doesn't automatically include them in commits.

At this point I feel confused about where the branch information is saved, and I try the command “git status” in another terminal, Git Bash. The result is: “fatal: this operation must be run in a work tree”.

Maybe I need to run it in the same folder, so I repeat

cd C:\Users\etc.

It doesn’t work. Let’s redo

git init

And now it works. But why? Why has changing the terminal forced me to repeat the initialisation process for the specified path? I assumed that specifying the path was enough. Anyway…

I want to Track an Untracked file called file.htm which is in “C:\Users\etc.”, as the status reported.

git add file.htm

I re-check the status with git status. That specific file is now in a green font and not in red, and it is preceded by instructions about how to revert this operation (I tried this suggested command to Untrack the file and checked back with git status… it works!).

I create a .gitignore file out of an empty text file and put it in my chosen directory. Here I can list all the files I want to ignore.

Example:

# ignore all .txt files
*.txt

The first line is a comment describing my intentions. The second one instructs git that I want to ignore all(*) .txt extensions. These files won’t be reported by git status anymore. I save the .gitignore file.
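
The effect of the *.txt rule can be imitated with Python's fnmatch module. This only approximates how Git matches simple patterns like this one; the file name notes.txt is made up, the other two appear in my repository.

```python
from fnmatch import fnmatch

patterns = ["*.txt"]  # the single rule from the .gitignore above
files = ["text.txt", "notes.txt", "file.htm"]

ignored = [name for name in files if any(fnmatch(name, p) for p in patterns)]
reported = [name for name in files if name not in ignored]

print(ignored)   # ['text.txt', 'notes.txt']
print(reported)  # ['file.htm']
```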

To put files in an environment called staging, a preparatory phase before committing (like holding the pen before writing...), I can use either

1) git add --all

or

2) git add -A

or

3) git add .

1) and 2) are the same: they stage all changes across the whole working tree, including new files, modifications, and deletions, for both Tracked and Untracked files.

3) stages the same kinds of changes, but only for the current directory and its subdirectories. (In Git versions before 2.0, it did not stage file deletions.)

It is still unclear to me the purpose of staging. Couldn’t we have a direct commit without the staging?

Anyway, I am ready to commit*.

*(You can only commit what has been staged unless you force it with git commit -a). 

In my experience, I found it unwise to force commit without staging because it might cause tracking issues.

git commit -m "description of what I am doing"

-m (message) is to type the comment. If I use git status, there are no more files shown to commit.

I also found that writing a comment for each commit can be extremely useful because it helps us and the team identify the change and the reason behind it. 

----------------------------------------------------------------------------------------------------------------------

If I modify one of the files in my folder and run git status, it informs me that one file has been modified (the file goes back to the working phase before staging). Use git diff to display the actual change (red means old, green means new).

git add file.htm -> to stage files.

git restore --staged file.htm -> to remove a file from staging (before commit).

git restore "file.htm" -> to restore a deleted file.

git mv "file.htm" "file2.htm" -> to move (rename) a file. Remember to commit it!

git log -> to display the list of commits made.

git log --oneline -> to display each commit on a single line.

git commit -m "new amended comment…" --amend -> to amend the last commit.

git log -p -> to display the commit history with the actual changes (diffs) introduced in each commit ("q" to quit).

git rebase -i --root -> to open an interactive rebase of the whole history in the editor (in Vim, ":x" saves and quits).
