Data flow problem

Introduction

To effectively integrate AI into EO and enable individual chapters to fully leverage the potential of this technology, the issue of data flow within our organization must first be addressed.

Each chapter has two main types of data:

1. Chapter-specific data, unique to a given chapter and accessible only to its members: for example, information about chapter members, the chapter budget, internal chapter procedures, etc.
2. Data shared across the entire EO, such as official forum training materials, guides for individual board members, and branding materials.

Both points require careful consideration and design, but in this discussion I would like to focus on point number 2: data shared across the entire EO.

Current Data Flow Model

In the current AI solution for EO, we use the CogniVis AI software.

In CogniVis AI, we create a separate instance/unit of the software for each chapter. This ensures that each chapter has full control over its data and can freely manage the user accounts for its chapter.

Link to the diagram above: https://docs.google.com/drawings/d/1PgJEkZtRAytCi81ziFLFp5_FqXDedBI6m-LUXUZshew/edit?usp=sharing

Legend:

Blue cylinders represent CogniVis instances for individual chapters (e.g., EO Poland, EO Berlin, EO Argentina).
Green rectangles represent the data sources (so-called connectors) connected to each chapter's instance. Each chapter can use different data sources: for example, EO Poland may use its Google Drive, EO Berlin its Microsoft SharePoint, and EO Argentina its Dropbox.
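
As a rough illustration, the current setup can be pictured as a small data model. This is only a sketch: the class names `Connector` and `ChapterInstance` and the method `visible_files` are hypothetical and are not part of the CogniVis AI software.

```python
from dataclasses import dataclass, field


@dataclass
class Connector:
    """A connected data source (a green rectangle in the diagram)."""
    kind: str  # e.g. "Google Drive"
    files: list[str] = field(default_factory=list)


@dataclass
class ChapterInstance:
    """One CogniVis instance (a blue cylinder in the diagram). Hypothetical model."""
    chapter: str
    connectors: list[Connector] = field(default_factory=list)

    def visible_files(self) -> list[str]:
        # An instance only sees the files held in its own connectors.
        return [name for c in self.connectors for name in c.files]


# The three example chapters, each with its own kind of data source.
instances = [
    ChapterInstance("EO Poland", [Connector("Google Drive")]),
    ChapterInstance("EO Berlin", [Connector("Microsoft SharePoint")]),
    ChapterInstance("EO Argentina", [Connector("Dropbox")]),
]
```

In this picture nothing is shared between instances: every file an instance can use has to be placed in one of that chapter's own connectors.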

Adding Data to a Chapter Instance

Let's consider a simple example: each chapter wants to add two files to its instance so that the AI can later use them to answer related questions:

1. The first file is a spreadsheet with data on the members of the given chapter

Example spreadsheet with member data: https://docs.google.com/spreadsheets/d/1BbusZF1i6689Je_JOENt4arsVNTTC9phJ0NmznI51Ug/edit?usp=sharing

Each chapter will have its own separate spreadsheet of this kind, and each chapter wants only its own members to have access to it.

Each chapter will therefore add this spreadsheet (marked in purple in the diagram below) to its data source. Following our example:

EO Poland will add the sheet "EO Poland Member Information Sheet" to its Google Drive.
EO Berlin will add the sheet "EO Berlin Member Information Sheet" to its Microsoft SharePoint.
EO Argentina will add the sheet "EO Argentina Member Information Sheet" to its Dropbox.

2. The second file is the PDF "SampleForumAgenda.pdf", an official document downloaded from https://www.eonetwork.org/

In this case, again, each chapter will add the file "SampleForumAgenda.pdf" (marked in red in the diagram below) to its data source.

Link to the diagram above: https://docs.google.com/drawings/d/1bv84hB65vT7kwyh99-chpx1RZiR2TwXHu-xFXwqT3pM/edit?usp=sharing

Problem Analysis

In the data flow above, it is correct that each chapter adds its member data sheet to its own CogniVis instance, because this file is different for each chapter and access to it should be restricted to that instance.

However, it is suboptimal that the "SampleForumAgenda.pdf" file is also added individually to each instance, even though it is identical everywhere and the data it contains is shared by all EO chapters.

For example, if EO Global releases a new version of this file, all chapters will have to update it individually, each in its own instance. This adds a lot of maintenance work and creates risks, such as a chapter forgetting to update and continuing to use an outdated version of official EO documents.

The issue is further complicated by the fact that there is a large volume of official EO data and documents, and new ones are released continuously. If each chapter has to update this data individually, discrepancies will emerge very quickly and eventually lead to complete disarray, significantly reducing the effectiveness of the AI that is supposed to rely on this data.
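
The drift risk can be made concrete with a short sketch that compares each chapter's copy of a shared file against the official version by content hash. The function names and the byte strings standing in for PDF contents are made up for illustration; only the standard-library `hashlib` module is used.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Return a SHA-256 hex digest identifying one exact file version."""
    return hashlib.sha256(content).hexdigest()


def outdated_chapters(copies: dict[str, bytes], official: bytes) -> list[str]:
    """List the chapters whose copy differs from the official version."""
    expected = fingerprint(official)
    return sorted(
        chapter
        for chapter, content in copies.items()
        if fingerprint(content) != expected
    )


# Made-up contents standing in for the real PDF bytes.
official_v2 = b"SampleForumAgenda v2"
copies = {
    "EO Poland": b"SampleForumAgenda v2",
    "EO Berlin": b"SampleForumAgenda v1",  # this chapter forgot to update
    "EO Argentina": b"SampleForumAgenda v2",
}

stale = outdated_chapters(copies, official_v2)
print(stale)  # ['EO Berlin']
```

A check like this would have to run across every chapter and every official document, which is exactly the maintenance overhead that per-chapter copies create.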

Solution and Suggested Data Flow Model

The data flow should be changed so that the official documents and data from EO Global, which are shared by all EO chapters, have a single source from which the CogniVis instances of all chapters pull data.

In this scenario, individual data (such as the member data sheets, marked in purple) would still be added individually by each chapter to its own instance.

However, data common to the entire EO (such as the "SampleForumAgenda.pdf" file, marked in red) should be stored in a single, official EO Global data repository that always contains the most up-to-date data.

The CogniVis instances of all EO chapters could then pull official global data from the EO Global repository, while each chapter adds its private data individually to its own instance.

This would significantly reduce the maintenance burden for data shared across the entire EO, as it would only need to be maintained and updated in one place.

Link to the diagram above: https://docs.google.com/drawings/d/1t2FvtLyfs-qvEVp2gA4pqaPdub46hZOI9KgwzRY8gPI/edit?usp=sharing
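
A minimal sketch of the suggested model, assuming every chapter instance holds a reference to one shared repository object plus its own private files. The names `GlobalRepository`, `ChapterInstance`, `publish`, and `visible_files` are hypothetical, as are the version labels; none of this reflects how CogniVis AI is actually implemented.

```python
from dataclasses import dataclass, field


@dataclass
class GlobalRepository:
    """Single official source maintained by EO Global (hypothetical)."""
    files: dict[str, str] = field(default_factory=dict)  # file name -> version

    def publish(self, name: str, version: str) -> None:
        # Adding or updating a file happens in exactly one place.
        self.files[name] = version


@dataclass
class ChapterInstance:
    """One chapter instance with shared and private data (hypothetical)."""
    chapter: str
    shared: GlobalRepository  # the same object for every chapter
    private: dict[str, str] = field(default_factory=dict)

    def visible_files(self) -> dict[str, str]:
        # Shared files are read from the global repository on demand,
        # so no chapter keeps its own copy of them.
        return {**self.shared.files, **self.private}


eo_global = GlobalRepository()
eo_global.publish("SampleForumAgenda.pdf", "v1")

poland = ChapterInstance(
    "EO Poland", eo_global, {"EO Poland Member Information Sheet": "v1"}
)
berlin = ChapterInstance(
    "EO Berlin", eo_global, {"EO Berlin Member Information Sheet": "v1"}
)

# One update in one place is visible to every instance.
eo_global.publish("SampleForumAgenda.pdf", "v2")
```

After the single `publish` call, both `poland.visible_files()` and `berlin.visible_files()` report version "v2" of the shared file, while each member sheet stays visible only in its own chapter's instance.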

What Exactly Should the EO Global Official Data Repository Be?

Below are some suggestions and considerations for possible solutions:

1. Cloud Storage

In the simplest solution, EO Global could set up cloud storage (Google Drive, Microsoft SharePoint, Dropbox, etc.) that would be regularly maintained and approved by the EO Global team.

2. API Communication with https://hub.eonetwork.org/

A more advanced solution would be to enable direct API communication between the CogniVis instances and https://hub.eonetwork.org/.

The open question is whether the data on https://hub.eonetwork.org/ is actually maintained regularly and always contains only the latest versions of all documents.

Issue with PDF Files for Official EO Global Documents

CogniVis AI handles reading PDF files well and, in the vast majority of cases, provides correct answers based on them.

However, PDF files (the format of most EO Global documents) are not the best solution for long-term use and create many complications, such as:

Difficulties in data extraction: The structure of PDF files is designed primarily for presenting content visually rather than for storing and processing information by machines. AI often has trouble correctly recognizing text, tables, graphics, and document layout, which leads to errors in data extraction.
Lack of consistent structure: PDF files have no unified standard for data layout. Even in similar documents the formatting may vary, which makes it harder for AI to interpret elements such as headers, lists, or text sections.
Limited access to metadata: Unlike formats such as JSON, XML, or CSV, PDF files generally do not expose their content as structured data that algorithms can easily analyze. This greatly limits the ability to search and filter information.
Character encoding issues: PDF files can store text in different encodings, which often causes problems in recognizing certain characters, especially in multilingual documents or with non-standard fonts.
Inefficient processing of multi-page data: AI algorithms may struggle to recognize context when content is split across multiple pages. For example, a sentence may break at the end of one page and continue on the next, which can lead to incorrect interpretation.
Lack of quick and efficient updates: PDFs are generally static, which makes them unsuitable for dynamic updates and automatic retrieval of the latest data. For AI, this means the sources have to be updated manually every time.
Challenges in recognizing images: PDF files often contain text stored as images, which requires additional processing with OCR (Optical Character Recognition). This not only lengthens the analysis but can also introduce errors, especially with low-quality scans.
Complicated semantic analysis: It is harder for AI to understand context in PDF files because the text is often arranged in a nonlinear manner (e.g., in columns or inserted in frames). This can lead to misinterpretation of the context, the meaning, and the relationships between text fragments.

This is a problem to solve in the future (currently, even with PDFs, we can deliver a lot of value with AI for EO). Ultimately, however, a different solution should be devised. We would need some kind of document management system that makes it possible to create a structure optimal for AI and to update documents easily.
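
To give a sense of what a structure better suited to AI might look like, here is one possible sketch of an official document stored as structured data with explicit version and section fields. The schema, the field names, and the placeholder section texts are assumptions made purely for illustration; they do not describe an existing EO Global or CogniVis format.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Section:
    heading: str
    text: str


@dataclass
class OfficialDocument:
    """Hypothetical structured form of an official EO Global document."""
    title: str
    version: str
    language: str
    sections: list[Section] = field(default_factory=list)


doc = OfficialDocument(
    title="Sample Forum Agenda",
    version="2",
    language="en",
    sections=[
        Section("Opening", "Placeholder text for the opening section."),
        Section("Updates", "Placeholder text for the updates section."),
    ],
)

# Explicit fields can be searched and filtered directly,
# without layout parsing or OCR.
as_json = json.dumps(asdict(doc), indent=2)
headings = [section.heading for section in doc.sections]
```

With a representation like this, replacing a document with a newer version is a matter of publishing a record with a higher `version` value, and headings and sections are available without guessing at the visual layout.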

Summary

First and foremost, we need to find a solution for the EO Global Official Data Repository. Treat this document as the opening of a brainstorming session and share your ideas for addressing this challenge in the comments.