![Die Aktualisierung von Excel PowerQuery dauert etwa 4 Stunden](https://rvso.com/image/1672318/Die%20Aktualisierung%20von%20Excel%20PowerQuery%20dauert%20etwa%204%20Stunden.png)
Ich habe eine Reihe von PowerQueries geschrieben, die die Krankheitshäufigkeit eines Mitarbeiters über 12 Monate berechnet und dabei sein Schichtmuster (möglicherweise nicht Mo-Fr) und seine vertraglich vereinbarten Stunden berücksichtigt.
Es werden die Anzahl der Krankheitstage, die Anzahl der Krankheitsfälle und der Bradford-Score zurückgegeben.
Wenn jemand vor oder nach einem Wochenende krank ist, wird dies als derselbe Krankheitsfall gezählt.
Momentan sind die Rohdaten in zwei CSV-Dateien mit 180.908 Datensätzen enthalten. Pro Woche wachsen sie um ca. 3100 Datensätze – das entspricht den pro Mitarbeiter pro Tag geleisteten Arbeitsstunden.
Die Originaldaten haben 33 Spalten, ich verwende jedoch nur 10 davon. Die anderen Spalten werden als einer der ersten Schritte in den Abfragen gelöscht.
Das Problemist, dass ich es ausgeführt habe und die Aktualisierung etwa vier Stunden gedauert hat... viel zu lang.
Kann irgendjemand irgendwelche Engpässe erkennen, abgesehen von der Tatsache, dass es sich um eine Excel-"Datenbank" handelt und daher keine Abfragefaltung verwendet werden kann? Für
jede Hilfe wäre ich sehr dankbar.
Ein Beispiel für die Daten ist:
Erstellen Sie einige Rohdaten:
- A1:
={"Date","Facility","Department","Status","Forename","Surname","Emp No","Contract","Sick","Wkg Patt"}
- A2:
=SEQUENCE(438,,44296)
(als Datum formatieren) - B2:
=IF(A2#<>"",{"Warehouse","Inbound","Perm","D","BC","1",40})
- Ich 2:
=IF(MID(J2,WEEKDAY(A2,16),1)=0,"",IF(RANDBETWEEN(0,50)<10,RANDBETWEEN(1,8),""))
- und ausfüllen - J2:
=IF(A2#<>"","0011111 " & REPT(CHAR(RANDBETWEEN(65,80)),RANDBETWEEN(0,5)))
- dies stellt das Schichtmuster mit einer Reihe von Nullen und Einsen dar, beginnend an einem Samstag. Der Text am Ende zeigt, dass diese Spalte ursprünglich nicht dafür vorgesehen war, daher kann auch anderer Text enthalten sein.
Speichern Sie die Daten in zwei CSV-Dateien (ich habe vor dem erneuten Speichern einfach „D“, „BC“ und „1“ in der B2-Formel geändert).
Die Excel-Reporting-Datei sollte ebenfalls drei Tabellen enthalten (die für das Beispiel leer gelassen werden können).
Leavers
mit zwei Spalten (Emp No
undName
)NameUpdates
mit zwei Spalten (Emp No
undName
)IgnoreList
mit drei Spalten (Emp No
,Start Date
undEnd Date
).
Es sollten auch zwei benannte Zellen vorhanden sein - StartDate
und EndDate
. Diese Zellen enthalten derzeit17. Juni 22Und18. Juni 21jeweils.
Die Abfragen:
fnGetNamedRange - ruft einen Wert aus einem benannten Bereich ab. Wird verwendet innerhalbRohdatenAbfrage.
let GetNamedRange=(NamedRange) =>
let
name = Excel.CurrentWorkbook(){[Name=NamedRange]}[Content],
value = name{0}[Column1]
in
value
in GetNamedRange
Rohdaten
Diese Abfrage kombiniert die beiden CSV-Dateien, entfernt alle in der Leavers
Tabelle aufgeführten Abgänger, aktualisiert alle falsch geschriebenen Namen in der NameUpdates
Tabelle und filtert die Tabelle nach den erforderlichen Daten.
let
//Import data from the Reporting Suite.
Source = Csv.Document(File.Contents(
"C:\Documents\Raw Data Suite.csv"
),[Delimiter=",", Columns=33, Encoding=1252, QuoteStyle=QuoteStyle.None]),
RemoveTopRows = Table.Skip(Source,0), //Normally 3, set to 0 for example.
ReportingSuitePromoteHeaders = Table.PromoteHeaders(RemoveTopRows, [PromoteAllScalars=true]),
ReportingSuiteRemoveOtherColumns = Table.SelectColumns(ReportingSuitePromoteHeaders,{"Date", "Facility", "Status", "Forename", "Surname", "Emp No", "Contract", "Department", "Sick", "Wkg Patt"}),
//Import data from the Archive.
ArchiveSource = Csv.Document(File.Contents(
"C:\Documents\Archive.csv"
),[Delimiter=",", Columns=33, Encoding=1252, QuoteStyle=QuoteStyle.None]),
ArchivePromoteHeaders = Table.PromoteHeaders(ArchiveSource, [PromoteAllScalars=true]),
ArchiveRemoveOtherColumns = Table.SelectColumns(ArchivePromoteHeaders,{"Date", "Facility", "Status", "Forename", "Surname", "Emp No", "Contract", "Department", "Sick", "Wkg Patt"}),
//Combine the two tables of data.
CombineTables = Table.Combine({ReportingSuiteRemoveOtherColumns, ArchiveRemoveOtherColumns}),
//Join Forename & Surname.
TrimNames = Table.TransformColumns(CombineTables,{{"Forename", Text.Trim, type text}, {"Surname", Text.Trim, type text}}),
JoinEmployeeName = Table.CombineColumns(TrimNames,{"Forename", "Surname"},Combiner.CombineTextByDelimiter(" ", QuoteStyle.None),"Name"),
ChangeNameCase = Table.TransformColumns(JoinEmployeeName,{{"Name", Text.Proper, type text}}),
//Reference Name Updates table.
NameSource = Excel.CurrentWorkbook(){[Name="NameUpdates"]}[Content],
ChangeDataType1 = Table.TransformColumnTypes(NameSource,{{"Emp No", type text}, {"Name", type text}}),
MergeTables = Table.NestedJoin(ChangeNameCase, {"Emp No"}, ChangeDataType1, {"Emp No"}, "NameUpdates", JoinKind.LeftOuter),
ExpandNameUpdates = Table.ExpandTableColumn(MergeTables, "NameUpdates", {"Name"}, {"NameUpdates.Name"}),
AddNewNameColumn = Table.AddColumn(ExpandNameUpdates, "New Name", each if [NameUpdates.Name] = null then [Name] else [NameUpdates.Name], type text),
RemoveOtherColumns = Table.SelectColumns(AddNewNameColumn,{"Date", "Facility", "Status", "New Name", "Emp No", "Contract", "Department", "Sick", "Wkg Patt"}),
RenameColumn = Table.RenameColumns(RemoveOtherColumns,{{"New Name", "Name"}}),
ChangeDataType2 = Table.TransformColumnTypes(RenameColumn,
{{"Date", type date}, {"Facility", type text}, {"Status", type text}, {"Name", type text}, {"Emp No", type text}, {"Contract", type number}, {"Department", type text}, {"Sick", type number}, {"Wkg Patt", type text}}),
//Filter to required dates and select Perm employees who have had a period of sick.
FilterDates = Table.SelectRows(ChangeDataType2, each [Date] >= Date.From(fnGetNamedRange("StartDate")) and [Date] <= Date.From(fnGetNamedRange("EndDate"))),
FilterRows = Table.SelectRows(FilterDates, each ([Status] = "Perm")),
//Reference the Leavers table.
LeaversSource = Excel.CurrentWorkbook(){[Name="Leavers"]}[Content],
RemovedName = Table.SelectColumns(LeaversSource,{"Emp No"}),
ChangeDataType3 = Table.TransformColumnTypes(RemovedName,{{"Emp No", type text}}),
RemoveLeavers = Table.NestedJoin(FilterRows, {"Emp No"}, ChangeDataType3, {"Emp No"}, "Leavers", JoinKind.LeftAnti),
RemoveLeaverColumn = Table.RemoveColumns(RemoveLeavers,{"Leavers"}),
SortTable = Table.Sort(RemoveLeaverColumn,{{"Emp No", Order.Ascending}, {"Date", Order.Ascending}}),
//Indicate which days are work days and which aren't.
GetShiftPattern = Table.AddColumn(SortTable, "Shift Pattern", each Text.Start(Text.From([Wkg Patt], "en-GB"), 7), type text),
IsShiftPattern = Table.AddColumn(GetShiftPattern, "ShiftPatternCheck", each Value.Is(Value.FromText(Text.Start(Text.From([Wkg Patt], "en-GB"), 7)), type number)),
IsWorkDay = Table.AddColumn(IsShiftPattern, "WorkDay", each if [ShiftPatternCheck]=true then Text.Middle([Shift Pattern], Date.DayOfWeek([Date], Day.Saturday), 1) else
if Date.DayOfWeek([Date],Day.Saturday)<=1 then 0 else 1, type text),
ChangeDataType4 = Table.TransformColumnTypes(IsWorkDay,{{"WorkDay", Int64.Type}}),
RemoveShiftPatternColumns = Table.Buffer(Table.RemoveColumns(ChangeDataType4,{"Wkg Patt", "Shift Pattern", "ShiftPatternCheck"})),
//Combine Departments
RemoveOtherColumns1 = Table.SelectColumns(RemoveShiftPatternColumns,{"Emp No", "Department"}),
RemoveDuplicates1 = Table.Distinct(RemoveOtherColumns1),
GroupDepartments = Table.Group(RemoveDuplicates1, {"Emp No"}, {{"Department", each Text.Combine([Department], ", "), type nullable text}}),
AddDepartments = Table.NestedJoin(RemoveShiftPatternColumns, {"Emp No"}, GroupDepartments, {"Emp No"}, "Dept", JoinKind.LeftOuter),
ExpandDepartments = Table.ExpandTableColumn(AddDepartments, "Dept", {"Department"}, {"Department.1"}),
RemoveOtherColumns2 = Table.SelectColumns(ExpandDepartments,{"Date", "Facility", "Status", "Name", "Emp No", "Contract", "Department.1", "Sick", "WorkDay"}),
RenameColumns = Table.Buffer(Table.RenameColumns(RemoveOtherColumns2,{{"Department.1", "Department"}}))
in
RenameColumns
Arbeitsfreie Tage– listet die freien Tage jedes Mitarbeiters basierend auf den Nullen und Einsen in der Spalte „Arbeitsmuster“ auf.
let
Source = #"Raw Data",
RemoveWorkingDays = Table.SelectRows(Source, each ([WorkDay] = 0)),
RemoveOtherColumns = Table.Buffer(Table.SelectColumns(RemoveWorkingDays,{"Date", "Emp No"}))
in
RemoveOtherColumns
fnNetzwerkTage- berechnet die Anzahl der Arbeitstage ohne Feiertage (oder in diesem Fall arbeitsfreie Tage)
let
func = (StartDate as date, EndDate as date, optional Holidays as list, optional StartOfWeek as number) =>
let
// optional StartOfWeek, if empty the week will start on Monday
startOfWeek = if StartOfWeek = null then 1 else StartOfWeek,
// Providing for logic where EndDate is after StartDate
Start = List.Min({StartDate, EndDate}),
End = List.Max({StartDate, EndDate}),
// Switch sign if EndDate is before StartDate
Sign = if EndDate < StartDate then -1 else 1,
// Get list of dates between Start- and EndDate
ListOfDates = List.Dates(Start, Number.From(End-Start)+1,#duration(1,0,0,0)),
// if the optional Holidays parameter is used: Keep only those dates in the list that don't occur in the list of Holidays;
// otherwise continue with previous table
DeleteHolidays = if Holidays = null then ListOfDates else List.Difference(ListOfDates, List.Transform(Holidays, Date.From )),
// Count the number of days (items in the list)
CountDays = List.Count(DeleteHolidays) * Sign
in
CountDays
in
func
fnArbeitsfreie Tage – gibt basierend auf der Abfrage die Daten zurück, an denen ein Mitarbeiter nicht arbeitet Non-work days
.
let
non_work_dates = (SourceTable as table, StartDate as date, EndDate as date, EmpNo as text) =>
let
Source = SourceTable,
FilterToEmpNo = Table.SelectRows(Source, each [Emp No] = EmpNo and [Date] >= StartDate and [Date] <= EndDate),
RemoveEmpNo = Table.SelectColumns(FilterToEmpNo,{"Date"}),
ToList = RemoveEmpNo[Date]
in
ToList
in
non_work_dates
Krankheitszeiten- Identifiziert die Anfangs- und Enddaten jedes Krankheitszeitraums
let
Source = #"Raw Data",
FilterToSick = Table.SelectRows(Source, each ([Sick] <> null)),
//The data is sorted by Employee Number and Date so periods of sickness per employee appear next to each other.
//Add three indexes so data from the previous and next row may be referenced in the current row.
//Index1 starts at 1.
AddIndex1 = Table.AddIndexColumn(FilterToSick, "Index1", 0, 1, Int64.Type),
Index1PK = Table.AddKey(AddIndex1, {"Index1"}, true),
SortByIndex1 = Table.Sort(Index1PK, {"Index1", Order.Ascending}),
//We're only adding the Date & Emp No from the previous & next row to the current row.
//So a table is created containing only those two fields.
//Add Index2 to the JoinTable. This will start at 0
JoinTable = Table.SelectColumns(Index1PK,{"Date", "Emp No"}),
AddIndex2 = Table.AddIndexColumn(JoinTable, "Index2", 1, 1, Int64.Type),
Index2PK = Table.AddKey(AddIndex2, {"Index2"}, true),
SortByIndex2PK = Table.Sort(Index2PK, {"Index2", Order.Ascending}),
//Add Index3 to the JoinTable. This will start at 2
AddIndex3 = Table.AddIndexColumn(JoinTable, "Index3", -1, 1, Int64.Type),
Index3PK = Table.AddKey(AddIndex3, {"Index3"}, true),
SortByIndex3PK = Table.Sort(Index3PK, {"Index3", Order.Ascending}),
//Add the Employee Number and Date from the next row to the current row.
Index2Merge = Table.Join(Table.PrefixColumns(SortByIndex2PK, "Prev"), {"Prev.Index2"}, Table.PrefixColumns(SortByIndex1,"Current"), {"Current.Index1"}, JoinAlgorithm.SortMerge),
SortByDate1 = Table.Sort(Index2Merge,{{"Current.Emp No", Order.Ascending}, {"Current.Date", Order.Ascending}}),
Index3Merge = Table.Join(Table.PrefixColumns(SortByIndex3PK,"Next"), {"Next.Index3"}, SortByDate1, {"Current.Index1"}, JoinAlgorithm.SortMerge),
SortByDate1Again = Table.Sort(Index3Merge,{{"Current.Emp No", Order.Ascending}, {"Current.Date", Order.Ascending}}),
DaysSincePreviousSick = Table.AddColumn(SortByDate1Again, "Days Since Previous", each try fnNetworkDays([Prev.Date],[Current.Date],fnNONWORKDAYS(#"Non-work days",[Prev.Date], [Current.Date], [Current.Emp No]))-1 otherwise null),
DaysToNextSick = Table.AddColumn(DaysSincePreviousSick, "Days To Next", each try fnNetworkDays([Current.Date],[Next.Date],fnNONWORKDAYS(#"Non-work days",[Current.Date], [Next.Date], [Current.Emp No]))-1 otherwise null),
IdentifyStart = Table.AddColumn(DaysToNextSick, "Start", each if [Days Since Previous]<>1 or [Prev.Date]=null or [Current.Emp No]<>[Prev.Emp No] then "Start" else null),
IdentifyEnd = Table.AddColumn(IdentifyStart, "End", each if [Days To Next]<>1 or [Current.Emp No]<>[Next.Emp No] then "End" else null),
RemoveOtherColumns = Table.SelectColumns(IdentifyEnd,{"Current.Date", "Current.Facility", "Current.Status", "Current.Name", "Current.Emp No", "Current.Contract", "Current.Department", "Current.Sick", "Start", "End"}),
RenameColumns= Table.RenameColumns(RemoveOtherColumns,{{"Current.Date", "Date"}, {"Current.Facility", "Facility"}, {"Current.Status", "Status"}, {"Current.Name", "Name"},
{"Current.Emp No", "Emp No"}, {"Current.Contract", "Contract"}, {"Current.Department", "Department"}, {"Current.Sick", "Sick"}})
in
RenameColumns
fnSumBetween - summiert die Krankheitsstunden zwischen zwei Daten für eine angegebene Mitarbeiternummer.
let
Sum_Between = (SourceTable as table, StartDate as date, EndDate as date, EmpNo as text) =>
let
Source = SourceTable,
FilterToData = Table.SelectRows(Source, each [Emp No] = EmpNo and [Date] >= StartDate and [Date] <= EndDate),
GroupRows = Table.Group(FilterToData, {"Emp No"}, {{"Sick Hours", each List.Sum([Sick]), type nullable number}})
in
GroupRows
in
Sum_Between
Gesamte Krankheitszeiten - gibt für jeden Krankheitszeitraum eine einzelne Zeile zusammen mit der Anzahl der Krankheitstage zurück.
let
Source = #"Periods of Sick",
MergeStartAndEnd = Table.CombineColumns(Source,{"Start", "End"},Combiner.CombineTextByDelimiter("", QuoteStyle.None),"SickPeriods"),
//Add multi-day sick.
FilterRows1 = Table.SelectRows(MergeStartAndEnd, each ([SickPeriods] = "End" or [SickPeriods] = "Start")),
AddIndex1 = Table.AddIndexColumn(FilterRows1, "Index1", 0, 1, Int64.Type),
AddIndex2 = Table.AddIndexColumn(AddIndex1, "Index2", 1, 1, Int64.Type),
MergeIndexes = Table.Join(AddIndex2, "Index1", Table.PrefixColumns(AddIndex2,"1"), "1.Index2", JoinAlgorithm.SortMerge),
FilterRows2 = Table.SelectRows(MergeIndexes, each ([SickPeriods] = "End")),
RemoveOtherColumns1 = Table.SelectColumns(FilterRows2,
{"1.Date", "Date", "Facility", "Status", "Name", "Emp No", "Contract", "Department"}),
RenameColumns1 = Table.RenameColumns(RemoveOtherColumns1,{{"1.Date", "From"}, {"Date", "To"}}),
AddSickHours = Table.AddColumn(RenameColumns1, "Sick", each fnSumBetween(#"Periods of Sick",[From],[To],[Emp No])),
ExpandTable = Table.ExpandTableColumn(AddSickHours, "Sick", {"Sick Hours"}, {"Sick Hours"}),
//Add single day sick periods.
FilterRows3 = Table.SelectRows(MergeStartAndEnd, each ([SickPeriods] = "StartEnd")),
DuplicateDateColumn = Table.DuplicateColumn(FilterRows3, "Date", "To"),
RemoveOtherColumns2 = Table.SelectColumns(DuplicateDateColumn,{"Date", "To", "Facility", "Status", "Name", "Emp No", "Contract", "Department", "Sick"}),
RenameColumns2 = Table.RenameColumns(RemoveOtherColumns2,{{"Date", "From"}, {"Sick", "Sick Hours"}}),
//Join single and multi day sick periods.
AppendData = Table.Combine({ExpandTable, RenameColumns2}),
SortRows = Table.Sort(AppendData,{{"Emp No", Order.Ascending}, {"From", Order.Ascending}}),
CalculateDaysSick = Table.AddColumn(SortRows, "Days Sick", each Number.Round([Sick Hours]/([Contract]/5),2))
in
CalculateDaysSick
Bradford-Ergebnis - gibt die Gesamtzahl der Krankheitstage und -vorfälle zurück, zusammen mit derBradford-Ergebnis
let
Source = #"Total Periods of Sick",
GroupRows = Table.Group(Source, {"Emp No", "Name", "Department", "Facility"}, {{"Total Days", each List.Sum([Days Sick]), type number}, {"Occurrences", each Table.RowCount(_), Int64.Type}}),
AddBradfordScore = Table.AddColumn(GroupRows, "Bradford Score", each Number.Power([Occurrences],2)*[Total Days])
in
AddBradfordScore
Antwort1
Sie verwenden das falsche Werkzeug, als ob Sie mit einem Goldschmiedehammer einen Eisenbahnnagel einschlagen wollten. Es gibtvielerelationale Datenbanken verfügbar, darunter kostenloseLibreOffice Base, Undviele andere. FürBaseund viele andere Datenbanken, dieCSV-Dateien können in eine neue Datenbank importiert werdenmit wenig Aufwand.
Es kann eine Weile dauern, bisLernen Sie die Grundlagen einer Datenbankabfragesprache, wie eine Version von SQL, aber es wird Ihre Aufgabeviel einfacherund erreichbarer.