Package 'ERDbuilder'

Title: Entity Relationship Diagrams Builder
Description: Build entity relationship diagrams (ERD) to specify the nature of the relationship between tables in a database.
Authors: Jonathan Wood [aut], Guillermo Basulto-Elias [aut, cre]
Maintainer: Guillermo Basulto-Elias <[email protected]>
License: GPL (>= 3)
Version: 1.0.0
Built: 2025-01-30 18:30:31 UTC
Source: https://github.com/gbasulto/erdbuilder

Create ERD Object

Description

This function serves as a constructor for an Entity-Relationship Diagram (ERD) object. This object encapsulates both the data frames representing the entities and the relationships between these entities. The function takes as its arguments a list of data frames and a list of relationships and returns a list object of class "ERD".

Usage

create_erd(df_list, relationships)

Arguments

df_list

A named list of data frames, where each data frame represents an entity in the ERD. The names of the list elements correspond to the names of the entities.

relationships

A nested named list describing the relationships between entities. The top-level names should correspond to names in df_list. Each element is itself a named list with one entry per related entity. As in the example below, each entry gives the join variables (for instance crs_id = "crs_id") and a two-element character vector named relationship. The acceptable symbols for relationship are listed in "Details".

Details

Possible values in each relationship element of the list include:

"||"

which indicates one and only one

">|"

which indicates one or more (left table)

"|<"

which indicates one or more (right table)

">0"

which indicates zero or more (left table)

"0<"

which indicates zero or more (right table)

"|0"

which indicates zero or one (left table)

"0|"

which indicates zero or one (right table)
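
The symbols above can be kept in a small lookup table, for example to check a relationship vector before building the ERD. The following is a minimal sketch in base R. The names cardinality_symbols and is_valid_relationship are hypothetical and not part of ERDbuilder, and the length-two check is an assumption based on the examples in this manual.

```r
# Hypothetical lookup table of the symbols listed above (not part of ERDbuilder).
cardinality_symbols <- c(
  "||" = "one and only one",
  ">|" = "one or more (left table)",
  "|<" = "one or more (right table)",
  ">0" = "zero or more (left table)",
  "0<" = "zero or more (right table)",
  "|0" = "zero or one (left table)",
  "0|" = "zero or one (right table)"
)

# Hypothetical helper: TRUE when `x` is a length-two character vector
# whose elements are both documented symbols.
is_valid_relationship <- function(x) {
  is.character(x) && length(x) == 2 && all(x %in% names(cardinality_symbols))
}

is_valid_relationship(c("||", "|<"))  # both symbols are documented
is_valid_relationship(c("||", "<>"))  # "<>" is not a documented symbol

# Look up the meaning of each end of a relationship.
unname(cardinality_symbols[c("||", "|<")])
```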

The names used in df_list and relationships must be consistent, because they are used to create the ERD object and in subsequent operations such as rendering and performing joins.

The function bundles the data and relationships of an ERD into a single R object, which is the input expected by those downstream operations.
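
One way to check name consistency before calling create_erd is sketched below in base R. The helper find_unknown_entities is hypothetical and not part of ERDbuilder. It assumes that both the top-level names and the nested names in relationships refer to entities in df_list.

```r
# Hypothetical helper (not part of ERDbuilder): returns every entity name
# mentioned in `relationships` that has no data frame in `df_list`.
find_unknown_entities <- function(df_list, relationships) {
  # Top-level names plus the names of each nested relationship list.
  mentioned <- c(
    names(relationships),
    unlist(lapply(relationships, names), use.names = FALSE)
  )
  setdiff(unique(mentioned), names(df_list))
}

df_list <- list(
  courses = data.frame(crs_id = "water101", fac_id = "02do"),
  enrollment = data.frame(crs_id = "water101", st_id = "hu1")
)

relationships <- list(
  courses = list(
    enrollment = list(crs_id = "crs_id", relationship = c("||", "|<")),
    faculty = list(fac_id = "fac_id", relationship = c(">0", "||"))
  )
)

# "faculty" is referenced in `relationships` but missing from `df_list`.
find_unknown_entities(df_list, relationships)
```

An empty result suggests that every entity referenced in relationships has a matching data frame.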

Value

An object of class "ERD", which is a named list containing two elements:

data_frames

Named list of data frames identical to df_list.

relationship

Named list of relationships identical to relationships.

The class attribute of this list is set to "ERD".

Examples

# Load Packages -----------------------------------------------------------

library(ERDbuilder)
library(dplyr)

# Define entities ---------------------------------------------------------

students_tbl <- data.frame(
  st_id = c("hu1", "de2", "lo3"),
  dep_id = c("water", "evil", "values"),
  student = c("Huey", "Dewey", "Louie"),
  email = c("hubert.duck", "dewfort.duck", "llewellyn.duck"),
  dob = c("04-15", "04-15", "04-15")
)

courses_tbl <- data.frame(
  crs_id = c("water101", "evil205", "water202"),
  fac_id = c("02do", "03pe", "04mi"),
  dep_id = c("water", "evil", "water"),
  course = c("Swimming", "Human-chasing", "Dives")
)

enrollment_tbl <- data.frame(
  crs_id = c("water101", "evil205", "evil205", "water202"),
  st_id = c("hu1", "hu1", "de2", "de2"),
  final_grade = c("B", "A", "A", "F")
)

department_tbl <- data.frame(
  dep_id = c("water", "evil", "values"),
  department = c("Water activities", "Evil procurement", "Good values")
)

faculty_tbl <- data.frame(
  faculty_name = c("Scrooge McDuck", "Donald", "Pete", "Mickey"),
  title = c("Emeritus", "Full", "Assistant", "Full"),
  fac_id = c("01sc", "02do", "03pe", "04mi"),
  dep_id = c("water", "water", "evil", "values")
)

head(students_tbl)
head(courses_tbl)
head(enrollment_tbl)
head(department_tbl)
head(faculty_tbl)

## Define relationships ---------------------------------------------------
relationships <- list(
  courses = list(
    enrollment = list(crs_id = "crs_id", relationship = c("||", "|<")),
    department = list(dep_id = "dep_id", relationship = c(">|", "||")),
    faculty = list(fac_id = "fac_id", relationship = c(">0", "||"))
  ),
  enrollment = list(
    students = list(st_id = "st_id", relationship = c(">0", "||"))
  ),
  students = list(
    department = list(dep_id = "dep_id", relationship = c(">|", "||"))
  ),
  faculty = list(
    department = list(dep_id = "dep_id", relationship = c(">|", "||"))
  )
)

## Create ERD object
erd_object <-
  create_erd(
    list(
      students = students_tbl,
      courses = courses_tbl,
      enrollment = enrollment_tbl,
      department = department_tbl,
      faculty = faculty_tbl
    ),
    relationships)

## Render ERD -----------------------------------------------------------
render_erd(erd_object, label_distance = 0, label_angle = 15, n = 20)

Use inner join (unless another join type is specified)

Description

The perform_join function uses an inner join unless the user specifies the join type.

Usage

perform_join(erd_object, tables_to_join, specified_joins = NULL)

Arguments

erd_object

An object of class "ERD", which encapsulates the data frames and the relationships between them. This object is generated using the create_erd function.

tables_to_join

A character vector listing the names of the tables to join. The first table in this vector serves as the main table to which subsequent tables are joined. The tables are joined in the order specified, using the relationships defined with the first table.

specified_joins

An optional named list where each element's name corresponds to a table in tables_to_join and the value specifies the type of join to perform with that table. The default value is NULL, which activates automated mode (inner joins).

Details

This function orchestrates the joining of multiple tables based on a specified Entity-Relationship Diagram (ERD) object. It extracts the relationships and join criteria defined within the ERD object and executes the appropriate join operations using R's dplyr package.

The function can operate in two modes: automated and user-specified joins. In automated mode, join types are determined by the relationship symbols in the ERD object. In user-specified mode, the types of joins are explicitly provided by the user.

Implementation Details:

- Join Variables: For each pair of tables to be joined, the function extracts the relevant join variables from the ERD object.

- Join Type: Depending on the relationship symbol associated with each pair of tables, the function decides whether to perform an inner join or a left join. This decision is implemented by dynamically invoking the corresponding dplyr function (inner_join or left_join).

- Aggregation: The function uses do.call to dynamically execute the appropriate join operation, accumulating the result in the main_table variable, which is ultimately returned.

Notes:

- The function iteratively applies the join operations, using the first table in tables_to_join as the main table.

- The join operations are performed in the order specified in tables_to_join.

- When specified_joins is NULL, the function operates in automated mode, determining the type of join based on relationship symbols.

- The names in specified_joins should match the table names in tables_to_join for user-specified mode to function correctly.
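
The iterative procedure described above can be sketched in base R as follows. This is not the package's implementation. It swaps base merge() in for dplyr's inner_join and left_join so that it runs without extra packages, and the helper join_tables and the "inner" and "left" values used for specified_joins are hypothetical.

```r
# Base R sketch of the iterative join loop (assumption: merge() stands in
# for dplyr::inner_join() and dplyr::left_join()).
join_tables <- function(data_frames, relationships, tables_to_join,
                        specified_joins = NULL) {
  main_name <- tables_to_join[1]
  main_table <- data_frames[[main_name]]
  for (tbl in tables_to_join[-1]) {
    rel <- relationships[[main_name]][[tbl]]
    # Every element except `relationship` maps a key in the main table
    # (the element name) to a key in the joined table (the element value).
    keys <- rel[names(rel) != "relationship"]
    # Inner join unless the caller names another type for this table.
    join_type <- specified_joins[[tbl]]
    if (is.null(join_type)) join_type <- "inner"
    message(main_name, " -> ", tbl, ": ", join_type, " join")
    main_table <- do.call(merge, list(
      x = main_table, y = data_frames[[tbl]],
      by.x = names(keys), by.y = unlist(keys, use.names = FALSE),
      all.x = identical(join_type, "left")
    ))
  }
  main_table
}

data_frames <- list(
  courses = data.frame(
    crs_id = c("water101", "evil205", "water202"),
    dep_id = c("water", "evil", "water")
  ),
  enrollment = data.frame(
    crs_id = c("water101", "evil205", "evil205", "water202"),
    st_id = c("hu1", "hu1", "de2", "de2")
  ),
  department = data.frame(
    dep_id = c("water", "evil", "values"),
    department = c("Water activities", "Evil procurement", "Good values")
  )
)

relationships <- list(
  courses = list(
    enrollment = list(crs_id = "crs_id", relationship = c("||", "|<")),
    department = list(dep_id = "dep_id", relationship = c(">|", "||"))
  )
)

joined <- join_tables(
  data_frames, relationships,
  tables_to_join = c("courses", "enrollment", "department"),
  specified_joins = list(department = "left")
)
joined
```

Note that merge() sorts the result by the join keys, so the row order may differ from what a dplyr join would return.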

Value

A data frame resulting from the join operations conducted between the specified tables, consistent with the relationships indicated in the ERD object. Additionally, the types of joins used are printed to the console.

Examples

# Load Packages -----------------------------------------------------------

library(ERDbuilder)
library(dplyr)

# Define entities ---------------------------------------------------------

students_tbl <- data.frame(
  st_id = c("hu1", "de2", "lo3"),
  dep_id = c("water", "evil", "values"),
  student = c("Huey", "Dewey", "Louie"),
  email = c("hubert.duck", "dewfort.duck", "llewellyn.duck"),
  dob = c("04-15", "04-15", "04-15")
)

courses_tbl <- data.frame(
  crs_id = c("water101", "evil205", "water202"),
  fac_id = c("02do", "03pe", "04mi"),
  dep_id = c("water", "evil", "water"),
  course = c("Swimming", "Human-chasing", "Dives")
)

enrollment_tbl <- data.frame(
  crs_id = c("water101", "evil205", "evil205", "water202"),
  st_id = c("hu1", "hu1", "de2", "de2"),
  final_grade = c("B", "A", "A", "F")
)

department_tbl <- data.frame(
  dep_id = c("water", "evil", "values"),
  department = c("Water activities", "Evil procurement", "Good values")
)

faculty_tbl <- data.frame(
  faculty_name = c("Scrooge McDuck", "Donald", "Pete", "Mickey"),
  title = c("Emeritus", "Full", "Assistant", "Full"),
  fac_id = c("01sc", "02do", "03pe", "04mi"),
  dep_id = c("water", "water", "evil", "values")
)

head(students_tbl)
head(courses_tbl)
head(enrollment_tbl)
head(department_tbl)
head(faculty_tbl)

## Define relationships ---------------------------------------------------
relationships <- list(
  courses = list(
    enrollment = list(crs_id = "crs_id", relationship = c("||", "|<")),
    department = list(dep_id = "dep_id", relationship = c(">|", "||")),
    faculty = list(fac_id = "fac_id", relationship = c(">0", "||"))
  ),
  enrollment = list(
    students = list(st_id = "st_id", relationship = c(">0", "||"))
  ),
  students = list(
    department = list(dep_id = "dep_id", relationship = c(">|", "||"))
  ),
  faculty = list(
    department = list(dep_id = "dep_id", relationship = c(">|", "||"))
  )
)

## Create ERD object
erd_object <-
  create_erd(
    list(
      students = students_tbl,
      courses = courses_tbl,
      enrollment = enrollment_tbl,
      department = department_tbl,
      faculty = faculty_tbl
    ),
    relationships)

## Render ERD -----------------------------------------------------------
render_erd(erd_object, label_distance = 0, label_angle = 15, n = 20)


# Perform joins -----------------------------------------------------------

perform_join(erd_object, c("courses", "enrollment", "department"))

Render ERD

Description

The render_erd function graphically renders an ERD using DiagrammeR, incorporating pseudo-nodes to depict traditional ERD notations such as cardinality. It uses the DiagrammeR label attribute to place text labels near the ends of the edge lines. Each edge label is built from two strings representing the left and right relationship attributes, and conveys the cardinality of the relationship.

Usage

render_erd(erd_object, label_distance = 2.5, label_angle = 45, n = 10)

Arguments

erd_object

An object of class "ERD", generated using the create_erd function. This object encapsulates the data frames representing the entities and the relationships between these entities.

label_distance

A numeric value that specifies the distance between the edge labels and the lines connecting the nodes. The default value is 2.5.

label_angle

A numeric value that specifies the angle at which the edge labels are displayed. The default value is 45 degrees.

n

The maximum number of rows shown in each table. Variables are spread over additional columns as needed, so that every variable is shown while each table has at most n rows. The default value is 10.
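
The effect of n can be illustrated with a small base R sketch. The helper wrap_variables is hypothetical and only mimics the layout described above. How render_erd arranges the names internally is not shown in this manual, so the column-by-column fill is an assumption.

```r
# Hypothetical helper: arrange variable names column by column so that
# the resulting grid has at most `n` rows.
wrap_variables <- function(vars, n = 10) {
  stopifnot(length(vars) > 0, n >= 1)
  n_rows <- min(n, length(vars))
  n_cols <- ceiling(length(vars) / n_rows)
  # Pad with empty strings so the names fill a complete grid.
  padded <- c(vars, rep("", n_rows * n_cols - length(vars)))
  matrix(padded, nrow = n_rows, ncol = n_cols)
}

vars <- c("st_id", "dep_id", "student", "email", "dob")

wrap_variables(vars, n = 2)   # five names need three columns of two rows
wrap_variables(vars, n = 20)  # all names fit in a single column
```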

Details

This function graphically renders an Entity-Relationship Diagram (ERD) from an object of class "ERD". It uses the DiagrammeR package to generate a graph in which the entities are represented as nodes and the relationships as edges, which makes complex data relationships easier to understand and communicate.

Nodes: Each entity (i.e., data frame) is represented as a node. The node label consists of the entity name and the attribute names within the entity.

Edges: Relationships between entities are represented as edges between the corresponding nodes. Labels at the ends of the edges indicate the type and cardinality of the relationship.

The label_distance and label_angle parameters control the presentation of edge labels in the ERD to minimize overlap and improve readability.

The function uses a for loop to iterate through the entities and relationships, constructing the necessary DiagrammeR code to render each element. The nodesep and ranksep parameters in the DiagrammeR code control the node spacing in the rendered ERD, making it easier to visualize complex ERDs.
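
To make the role of these parameters concrete, the sketch below assembles a small Graphviz DOT string in base R. It is not the code that render_erd generates. The helper erd_edge is hypothetical, the nodesep and ranksep values are arbitrary, and the use of the standard Graphviz edge attributes taillabel, headlabel, labeldistance and labelangle (with the left symbol at the tail) is an assumption.

```r
# Hypothetical helper: build one DOT edge statement with cardinality
# labels at both ends (assumption: standard Graphviz edge attributes).
erd_edge <- function(from, to, relationship,
                     label_distance = 2.5, label_angle = 45) {
  sprintf(
    paste0(
      '  %s -> %s [dir = none, taillabel = "%s", headlabel = "%s", ',
      "labeldistance = %s, labelangle = %s];"
    ),
    from, to, relationship[1], relationship[2],
    format(label_distance), format(label_angle)
  )
}

# Assemble a complete graph; nodesep and ranksep set the node spacing.
dot <- paste(
  "digraph erd {",
  "  graph [nodesep = 0.5, ranksep = 0.75];",
  "  node [shape = box];",
  erd_edge("courses", "enrollment", c("||", "|<")),
  erd_edge("enrollment", "students", c(">0", "||")),
  "}",
  sep = "\n"
)

cat(dot, "\n")
```

If DiagrammeR is installed, a string like this can be passed to DiagrammeR::grViz(dot) to draw the graph.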

Value

A DiagrammeR graph object representing the ERD.

Examples

# Load Packages -----------------------------------------------------------

library(ERDbuilder)
library(dplyr)

# Define entities ---------------------------------------------------------

students_tbl <- data.frame(
  st_id = c("hu1", "de2", "lo3"),
  dep_id = c("water", "evil", "values"),
  student = c("Huey", "Dewey", "Louie"),
  email = c("hubert.duck", "dewfort.duck", "llewellyn.duck"),
  dob = c("04-15", "04-15", "04-15")
)

courses_tbl <- data.frame(
  crs_id = c("water101", "evil205", "water202"),
  fac_id = c("02do", "03pe", "04mi"),
  dep_id = c("water", "evil", "water"),
  course = c("Swimming", "Human-chasing", "Dives")
)

enrollment_tbl <- data.frame(
  crs_id = c("water101", "evil205", "evil205", "water202"),
  st_id = c("hu1", "hu1", "de2", "de2"),
  final_grade = c("B", "A", "A", "F")
)

department_tbl <- data.frame(
  dep_id = c("water", "evil", "values"),
  department = c("Water activities", "Evil procurement", "Good values")
)

faculty_tbl <- data.frame(
  faculty_name = c("Scrooge McDuck", "Donald", "Pete", "Mickey"),
  title = c("Emeritus", "Full", "Assistant", "Full"),
  fac_id = c("01sc", "02do", "03pe", "04mi"),
  dep_id = c("water", "water", "evil", "values")
)

head(students_tbl)
head(courses_tbl)
head(enrollment_tbl)
head(department_tbl)
head(faculty_tbl)

## Define relationships ---------------------------------------------------
relationships <- list(
  courses = list(
    enrollment = list(crs_id = "crs_id", relationship = c("||", "|<")),
    department = list(dep_id = "dep_id", relationship = c(">|", "||")),
    faculty = list(fac_id = "fac_id", relationship = c(">0", "||"))
  ),
  enrollment = list(
    students = list(st_id = "st_id", relationship = c(">0", "||"))
  ),
  students = list(
    department = list(dep_id = "dep_id", relationship = c(">|", "||"))
  ),
  faculty = list(
    department = list(dep_id = "dep_id", relationship = c(">|", "||"))
  )
)

## Create ERD object
erd_object <-
  create_erd(
    list(
      students = students_tbl,
      courses = courses_tbl,
      enrollment = enrollment_tbl,
      department = department_tbl,
      faculty = faculty_tbl
    ),
    relationships)

## Render ERD -----------------------------------------------------------
render_erd(erd_object, label_distance = 0, label_angle = 15, n = 20)